A review: Knowledge reasoning over knowledge graph
Article info

Article history:
Received 30 April 2019
Revised 8 September 2019
Accepted 10 September 2019
Available online 13 September 2019

Keywords:
Knowledge graph
Reasoning
Rule-based reasoning
Distributed representation-based reasoning
Neural network-based reasoning

Abstract

Mining valuable hidden knowledge from large-scale data relies on the support of reasoning technology. Knowledge graphs, as a new type of knowledge representation, have gained much attention in natural language processing. Knowledge graphs can effectively organize and represent knowledge so that it can be efficiently utilized in advanced applications. Recently, reasoning over knowledge graphs has become a hot research topic, since it can obtain new knowledge and conclusions from existing data. Herein we review the basic concept and definitions of knowledge reasoning and the methods for reasoning over knowledge graphs. Specifically, we dissect the reasoning methods into three categories: rule-based reasoning, distributed representation-based reasoning and neural network-based reasoning. We also review the related applications of knowledge graph reasoning, such as knowledge graph completion, question answering, and recommender systems. Finally, we discuss the remaining challenges and research opportunities for knowledge graph reasoning.

© 2019 Elsevier Ltd. All rights reserved.
Contents

1. Introduction  2
2. Methodology  2
3. Introduction to knowledge reasoning  2
   3.1. Definition of knowledge reasoning  2
   3.2. Introduction of leading knowledge graphs  3
   3.3. Knowledge reasoning oriented knowledge graph  4
4. Knowledge reasoning based on logic rules  4
   4.1. Knowledge reasoning based on first-order predicate logic rules  4
   4.2. Knowledge reasoning based on rules  5
   4.3. Knowledge reasoning based on ontology  6
   4.4. Knowledge reasoning based on random walk algorithm  6
5. Knowledge reasoning based on distributed representation  7
   5.1. Knowledge reasoning based on tensor factorization  7
   5.2. Knowledge reasoning based on distance model  8
   5.3. Knowledge reasoning based on semantic matching model  10
   5.4. Knowledge reasoning based on multi-source information  11
6. Knowledge reasoning based on neural networks  12
   6.1. Knowledge reasoning based on convolutional neural networks  12
   6.2. Knowledge reasoning based on recurrent neural networks  13
   6.3. Knowledge reasoning based on reinforcement learning  14
7. Applications of knowledge graph reasoning  15
   7.1. In-KG applications  15
        7.1.1. KG completion  15
        7.1.2. Entity classification  15
   7.2. Out-of-KG applications  15
        7.2.1. Medical domain  15
        7.2.2. Internet finance  15
        7.2.3. Intelligent question answering systems  16
        7.2.4. Recommendation systems  16
        7.2.5. Other applications  16
8. Discussion and research opportunities  16
   8.1. Summary  17
   8.2. Research opportunities  17
        8.2.1. Dynamic knowledge reasoning  17
        8.2.2. Zero-shot reasoning  17
        8.2.3. Multi-source information reasoning  17
        8.2.4. Multi-lingual knowledge graph reasoning  17
9. Conclusions  18
Declaration of Competing Interest  18
CRediT authorship contribution statement  18
Acknowledgements  18
References  18

∗ Corresponding author.
E-mail addresses: [email protected] (X. Chen), [email protected] (S. Jia), [email protected] (Y. Xiang).
https://doi.org/10.1016/j.eswa.2019.112948
0957-4174/© 2019 Elsevier Ltd. All rights reserved.

X. Chen, S. Jia and Y. Xiang / Expert Systems With Applications 141 (2020) 112948
…syllogism, which is the basis of modern deductive reasoning. From the Lambda Calculus that defines computers to various intelligent computing platforms, and from expert systems to large-scale knowledge graphs, all of these are inseparable from reasoning. With respect to the basic concepts of knowledge reasoning, academia has given different definitions. Zhang and Zhang (1992) pointed out that reasoning is the process of analyzing, synthesizing and making decisions on various things, starting from collecting existing facts and discovering interrelationships between things, to developing new insight. In short, reasoning is the process of drawing conclusions from existing facts by rules. Kompridis (2000) believed that reasoning is a collective term for a range of abilities, including the capacity to understand things, apply logic, and calibrate or validate architecture based on existing knowledge. Tari (2013) defined knowledge reasoning as the mechanism behind inferring new knowledge based on existing facts and logic rules. In general, knowledge reasoning is the process of using known knowledge to infer new knowledge.

Early reasoning studies were carried out by scholars in the fields of logic and knowledge engineering. The scholars of logic advocated the use of formalized methods to describe the objective world and believed that all reasoning was based on existing logical knowledge, such as first-order logic and predicate logic (Wu, Han, Li, Zheng, & Chen, 2018). They focused on how to draw correct conclusions from known propositions and predicates. In order to lighten the rigidity of the reasoning process, methods such as non-monotonic reasoning (McCarthy, 1980) and fuzzy reasoning (Zadeh, 1965) were developed for use in more complicated situations.

Unlike scholars from the logic field, who used propositions or first-order predicates to represent concepts in the objective world, scholars from the knowledge engineering field used semantic networks to represent richer concepts and knowledge for describing the relationships between entities and attributes. Nevertheless, early knowledge graphs relied entirely on expert knowledge: the entities, attributes, and relationships in a knowledge graph were entirely handcrafted by experts in the respective fields, as in CyC (Lenat & Guha, 1989).

With the explosive growth of Internet data, traditional methods based on manually built knowledge bases (KBs) cannot meet the need to mine large amounts of knowledge in the era of big data. For this reason, data-driven machine reasoning methods have gradually become the mainstream of knowledge reasoning research.

Table 1. Sources of publication candidates along with the number of publications in total (All), after excluding based on the title (I), and finally based on the full text (Selected). Works found both in a conference's proceedings and in Google Scholar are only counted once, as selected for that conference.

Venue                   All  I    Selected
Google Scholar Top 300  300  164  66
ACL                     39   23   7
EMNLP                   54   49   17
NAACL                   46   27   5
ISWC                    60   42   3
CIKM                    45   29   4
AAAI                    36   36   17
IJCAI                   41   41   7
ICML                    15   14   4
NIPS                    34   34   6
WWW                     30   26   6
ICLR                    5    5    3
COLING                  9    9    2
Conference              414  335  81
All                     714  499  147

Table 2. Examples of the world's leading knowledge graphs and their statistics (Paulheim, 2017).

Knowledge graphs  #Entities  #Relations  #Facts
WordNet           0.15M      200,000     4.5M
Freebase          50M        38,000      3B
YAGO              17M        76          150M
DBpedia (En)      4.8M       2800        176M
Wikidata          16M        1673        66M
NELL              2M         425         120M

3.2. Introduction of leading knowledge graphs

In 2012, Google introduced its Knowledge Graph (Singhal, 2012) project and took advantage of it to improve query result relevancy and users' search experience. Due to the increasing amount of Web resources and the release of linked open data (LOD) projects, many knowledge graphs have been constructed. In this section, we present a brief introduction of the world's leading knowledge graphs. Table 2 shows examples of leading knowledge graphs and their statistics.

WordNet. WordNet is a lexical database for the English language, created by the Cognitive Science Laboratory of Princeton University in 1985. Nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations, like the IS-A relation between dog and mammal or the PART-WHOLE relation between car and engine. WordNet has been used for a number of purposes in information systems, including word-sense disambiguation, information retrieval, text classification, text summarization, machine translation, and even crossword puzzle generation. WordNet version 3.0 is the latest version available and contains more than 150,000 words and 200,000 semantic relations.

Freebase. Freebase is a large collaborative knowledge base consisting of data composed mainly by its community members. It was constructed by Metaweb. Freebase contains data harvested from sources such as Wikipedia, NNDB, the Fashion Model Directory, and MusicBrainz, as well as data contributed by its users. Freebase's subjects are called 'topics', and the data stored about them depends on their 'type'; types themselves are grouped into 'domains'. Google's Knowledge Graph is powered in part by Freebase. There are currently about 3 billion triples in Freebase.

YAGO. YAGO (Suchanek, Kasneci, & Weikum, 2007) is an open source knowledge base developed by the Max Planck Institute. The information in YAGO is extracted from Wikipedia (e.g., categories, redirects, infoboxes), WordNet (e.g., synsets, hyponymy), and GeoNames. YAGO combines the clean taxonomy of WordNet with the richness of the Wikipedia category system, assigning the entities to more than 350,000 classes. YAGO attaches a temporal dimension and a spatial dimension to many of its facts and entities. It extracts and combines entities and facts from 10 Wikipedias in different languages. Currently, YAGO has knowledge of more than 17 million entities (like persons, organizations, cities, etc.) and contains more than 150 million facts about these entities. YAGO has been used in the Watson artificial intelligence system.

DBpedia. DBpedia is a cross-language project aiming to extract structured content from the information created in the Wikipedia project. There are more than 45 million interlinks between DBpedia and external datasets including Freebase, OpenCyc, etc. DBpedia uses the Resource Description Framework (RDF) to represent extracted information. The entities of DBpedia are classified in a consistent ontology, including persons, places, music albums, films, video games, organizations, species, and diseases. DBpedia was used as one of the knowledge sources in IBM Watson's Jeopardy!
winning system and can be integrated into Amazon Web Services applications.

Wikidata. Wikidata is a multilingual, open, linked, structured knowledge base that can be read and edited by both humans and machines. It supports more than 280 language versions of Wikipedia with a common source of structured data. Wikidata inherits the crowdsourcing collaboration mechanism from Wikipedia and also supports editing based on triples. It relies on the notions of item and statement. An item represents an entity. A statement is composed of one main property-value pair that encodes a fact, like "taxon name is Panthera leo", and optional qualifiers that add information about it, like "taxon author is Carl Linnaeus" (Pellissier Tanon, Vrandečić, Schaffert, Steiner, & Pintscher, 2016).

NELL. The Never-Ending Language Learning system (NELL) is a semantic machine learning system that runs 24/7, forever, learning to read the web. It was developed by a research team at Carnegie Mellon University. The inputs to NELL include (1) an initial ontology defining hundreds of categories and relations that NELL is expected to read about, and (2) 10 to 15 seed examples of each category and relation. Given these inputs, NELL automatically extracts triple facts from the Web. So far, NELL has accumulated over 120 million candidate beliefs by reading the web, and it is considering these at different levels of confidence, along with hundreds of learned phrasings, morphological features, and web page structures that NELL uses to extract beliefs from the web.

3.3. Knowledge reasoning oriented knowledge graph

With the development of knowledge graphs, reasoning over knowledge graphs has also attracted general concern. Referring to the definition of reasoning, we give the definition of reasoning over knowledge graphs as follows:

Definition 1 (Knowledge reasoning over KGs). Given a knowledge graph KG = ⟨E, R, T⟩ and a relation path P, where E, R, and T represent the sets of entities, relations, and triples, respectively, and the edges in R link two nodes to form a triple (h, r, t) ∈ T, knowledge reasoning aims at generating triples that do not exist in the KG: G′ = {(h, r, t) | h ∈ E, r ∈ R, t ∈ E, (h, r, t) ∉ T}.

Its goal is to use machine learning methods to automatically infer potential relations between entity pairs and identify erroneous knowledge based on existing data, with the purpose of complementing KGs. For example, if the KG contains facts like (Microsoft, IsBasedIn, Seattle), (Seattle, StateLocatedIn, Washington) and (Washington, CountryLocatedIn, USA), then we can obtain the missing link (Microsoft, HeadquarterLocatedIn, USA). The object of knowledge reasoning is not only the attributes and relations between entities, but also the attribute values of entities and the conceptual hierarchy of an ontology. For example, if an entity's identity card number attribute is known, the entity's gender, age, and other attributes can be obtained through reasoning.

A KG is basically a semantic network: a structured semantic knowledge base which can formally interpret concepts and their relations in the real world (Xu, Sheng, He, & Wang, 2016). A knowledge graph does not need to adopt cumbersome structures such as frames (Minsky, 1988) and scripts (Norenzayan, Smith, Kim, & Nisbett, 2002) in structured expressions; instead it uses simple triples with more flexible forms. Therefore, reasoning over knowledge graphs is not limited to traditional reasoning methods based on logic and rules, but can be diverse. At the same time, a knowledge graph consists of instances, which makes the reasoning methods more concrete.

In recent years, researchers have implemented many open information extraction (OIE) systems, such as TextRunner (Banko, Cafarella, Soderland, Broadhead, & Etzioni, 2007) and WOE (Wu & Weld, 2010), which greatly expands the data sources for knowledge graph construction (Akbik & Löser, 2012; Banko et al., 2007; Fader, Soderland, & Etzioni, 2011; Kertkeidkachorn & Ichise, 2017; Wu & Weld, 2010; Zhang et al., 2019a). The rich contents of knowledge bases thus provide new opportunities and challenges for the development of knowledge reasoning technology. With the popularity of knowledge representation learning, neural networks and other technologies, a series of new reasoning methods have been coming out.

4. Knowledge reasoning based on logic rules

Early knowledge reasoning approaches, including ontology reasoning, have received much attention and produced a series of reasoning methods. These methods, including predicate logic reasoning, ontology reasoning, and random walk reasoning, can be applied for reasoning over knowledge graphs.

4.1. Knowledge reasoning based on first-order predicate logic rules

Reasoning mainly relied on first-order predicate logic rules in the early stage of statistical relational learning study. First-order predicate logic uses propositions as the basic unit for reasoning, while propositions contain individuals and predicates. Individuals, which can exist independently, correspond to entity objects in the knowledge base; they can be a concrete thing or an abstract concept. The predicate is used to describe the nature of the individual and the relations between things. For example, interpersonal relationships can be reasoned about using first-order predicate logic by regarding relationships as predicates and characters as variables, using logical operators to express interpersonal relationships, and then setting the logic and constraints of relational reasoning to perform simple reasoning. The process of reasoning using first-order predicate logic is given in the following formula:

(YaoMing, wasBornIn, Shanghai) ∧ (Shanghai, locatedIn, China) ⇒ (YaoMing, nationality, China)

First-Order Inductive Learner (FOIL) (Schoenmackers, Etzioni, Weld, & Davis, 2010) is a typical work of predicate logic, which aims to search all the relations in the KG and acquire the Horn clause set of each relation as a feature pattern for predicting whether the correspondence exists. Finally, the relation discrimination model is obtained using a machine learning method. There are a large number of related works about FOIL. For example, nFOIL and tFOIL (Landwehr, Kersting, & Raedt, 2007) integrate the naïve Bayes learning scheme and a tree augmented naïve Bayes with FOIL, respectively. nFOIL guides the structure search by the probabilistic score of naïve Bayes. tFOIL relaxes the naïve Bayes assumption to allow additional probabilistic dependencies between clauses. kFOIL (Landwehr, Passerini, De Raedt, & Frasconi, 2010) combines FOIL's rule learning algorithm and kernel methods to derive a set of features from a relational representation; FOIL thus searches relevant clauses that can be used as features in kernel methods. Nakashole, Sozio, Suchanek, and Theobald (2012) present a query-time first-order reasoning approach for uncertain RDF knowledge bases with a combination of soft deduction rules and hard rules. Soft rules are used for deriving new facts, while hard rules are used to enforce consistency constraints among both KG and inferred facts. Galárraga, Teflioudi, Hose, and Suchanek (2013) propose the AMIE system for mining Horn rules on a knowledge graph. By applying these rules to the KBs, new facts can be derived for complementing knowledge graphs and detecting errors.

Traditional FOIL algorithms achieve high inference accuracy on small-scale knowledge bases. In addition, experimental results show that the "entity-relation" association model has strong reasoning ability. However, it is difficult to exhaust all inference patterns due to the complexity and diversity of entities and relations
in large-scale knowledge graphs. In addition, the high complex- instantiates the rule after manual screening, finally inferring a new
ity and the low efficiency of exhaustive algorithms make orig- relationship instance from other learned relation instances. Spass-
inal FOIL algorithm inappropriate for reasoning over large-scale YAGO expands the knowledge graph by abstracting the triples into
graphs. To solve this problem, Galárraga, Teflioudi, Hose, and equivalent rule classes. Paulheim and Bizer (2014) propose SD-
Suchanek (2015) extend AMIE to AMIE+ by a series of pruning and Type and SDValidate that exploit statistical distributions of prop-
query rewriting techniques for mining even larger KBs. Addition- erties and types for type completion and error detection. SDType
ally, AMIE+ increases the precision of the predictions by consider- uses the statistical distribution of types in the head entity and tail
ing type information and using joint reasoning. Demeester, Rock- entity position of the property for predicting the entities’ types.
täschel, and Riedel (2016b) present a scalable method to incor- SDValidate computes the relative predicate frequency (RPF) for
porate first-order implications into relation representations to im- each statement, with a low RPF value meaning incorrect. Jang and
prove large scale KG inference. While AMIE+ mines a single rule Megawati (2015) present a new approach for evaluating the quality
at a time, Wang and Li (2015) propose a novel rule learning ap- of knowledge graph. They choose the patterns appearing more fre-
proach named RDF2Rules. RDF2Rules mines Frequent Predicate Cy- quently as the generated test patterns for evaluating the quality of
cles (FPCs) to parallelize this process. It is more efficient to deal knowledge graph after analyzing the data patterns. Wang, Mazaitis,
with large-scale KBs than AMIE+ due to a proper pruning strategy. and Cohen (2013) and Wang, Mazaitis, Lao, and Cohen (2015) pro-
To formalize the semantic web and inference efficiently, some pose Programming with Personalized PageRank (ProPPR) for rea-
researchers proposed a tractable language, called description logic soning over a knowledge graph. Reasoning for ProPPR is based on
(DL). Description logic is a crucial foundation for ontology rea- a personalized PageRank process over the proof constructed by
soning that was developed on the basis of propositional logic SLD resolution theorem-prover. Catherine and Cohen (2016) have
and first-order predicate logic. The goal of description logic is shown that ProPPR can be used for performing knowledge graph
to balance representation power and reasoning complexity. It recommendations. They formulate the problem as a probabilis-
can provide well-defined semantics and powerful reasoning tools tic inference and learning task. Cohen (2016) propose TensorLog,
for knowledge graphs and satisfy the needs of ontology con- where inference uses a differentiable process. Inspired by Tensor-
struction, integration and evolution. Therefore, it is an ideal Log, Yang, Yang, and Cohen (2017) describe a framework, neural
ontology language. A KB expressed using a DL is composed logic programming, in which the structure and parameter learn-
of terminological axioms (TBox) and assertional axioms (ABox) ing of logical rules are combined in an end-to-end differentiable
(Lee, Lewicki, Girolami, & Sejnowski, 1999). The TBox is composed model.
of a collection of inclusion assertions stating general properties Rule-based reasoning methods can also combine manually de-
of concepts and roles. For instance, an assertion is the one that fined logic rules with various probability graph models and then
states that a concept denotes a specialization of another concept. obtain new facts by performing knowledge reasoning based on
The ABox consists of assertions on individual objects. The con- the constructed logical network. For example, Jiang, Lowd, and
sistency of the knowledge base is the basic problem in knowl- Dou (2012) propose a Markov logic-based system for cleaning
edge graph reasoning. The complex entity or relation reasoning NELL. This allows knowledge bases to make use of joint probabilis-
in a knowledge graphs can be transformed into a consistency de- tic reasoning, or, applies Markov logic network (MLN) (Richardson
tection problem through TBox and ABox, thus refining and re- & Domingos, 2006) to a web-scale problem. It uses only the on-
alizing knowledge reasoning. Halaschek-Wiener, Parsia, Sirin, and tological constraints and confidence scores of the initial system,
Kalyanpur (2006) present an description logic reasoning algorithm and labelled data. Chen and Wang (2014) present a probabilistic
for complementing knowledge graphs under both the addition knowledge base (ProbKB), which allows an efficient SQL-based in-
and removal of ABoxes assertions. It provides a critical step to- ference algorithm for knowledge completion that applies MLN in-
wards reasoning over fluctuating/streaming data. Calvanese, De Gi- ference rules in batches. Kuželka and Davis (2019) theoretically
acomo, Lembo, Lenzerini, and Rosati (2006) propose the language study the suitability of learning the weights of a Markov logic
EQL based on an epistemic first-order query language, which is network from a KB in the presence of missing data. After learn-
able to reason about incompleteness for querying description logic ing the weights, an MLN could be used to infer additional facts
knowledge graphs. A large number of fuzzy description logics to complete knowledge graphs. However, it is difficult to intro-
are proposed to extend classical description logics with fuzzy capability. Li, Xu, Lu, and Kang (2006) propose a novel discrete tableau algorithm for satisfiability of FSHI knowledge bases with general TBoxes, which supports a new way to achieve reasoning with general TBoxes in fuzzy DLs. Furthermore, Stoilos, Stamou, Pan, Tzouvaras, and Horrocks (2007) extend DL with fuzzy set theory in order to represent knowledge and perform reasoning tasks. To equip description logics for dealing with meta-knowledge, Krötzsch, Marx, Ozaki, and Thost (2018) enrich DL concepts and roles with finite sets of attribute-value pairs, called attributed description logics, for knowledge graph reasoning. Existing DL reasoners do not provide users with explanation services. To address this problem, Bienvenu, Bourgaux, and Goasdoué (2019) develop a framework to equip reasoning systems with explanation ability under inconsistency-tolerant semantics.

4.2. Knowledge reasoning based on rule

The basic idea of rule-based knowledge reasoning models is to reason over a KG by applying simple rules or statistical features. The reasoning component of the Never-Ending Language Learning system (NELL) (Mitchell et al., 2015) learns probabilistic rules and then applies them to infer new relation instances from those already present in the KG.

Clause confidence is difficult to introduce into Markov logic networks (MLN), because the clause values in logic rules must be Boolean variables. Moreover, the various combinations of Boolean variable assignments make learning and reasoning difficult to optimize. To solve this problem, probabilistic soft logic (PSL) (Kimmig, Bach, Broecheler, Huang, & Getoor, 2012) is proposed. PSL uses first-order logic rules as a template language for graphical models over random variables with soft truth values ranging in the interval [0,1]. Reasoning in this setting becomes a continuous optimization task, which can be handled efficiently. For this reason, Pujara, Miao, Getoor, and Cohen (2013a) use PSL to reason over candidate facts and their associated extraction confidences collectively, recognize co-referent entities, and incorporate ontological constraints. Furthermore, they propose a partitioning technique (Pujara, Miao, Getoor, & Cohen, 2013b) to reason over large-scale knowledge graphs while balancing reasoning speed and accuracy. The method first generates a knowledge graph where entities and relations are nodes and ontological constraints are edges. Then the edge min-cut, a clustering technique, is used to partition the relations and labels. Finally, it uses PSL to define a joint probability distribution over knowledge graphs to accomplish collective reasoning. Bach, Broecheler, Huang, and Getoor (2017) propose Hinge-Loss Markov Random Fields (HL-MRFs).
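PSL's soft truth values can be made concrete with the Łukasiewicz relaxations of the logical connectives on which PSL and HL-MRFs are built; the ground rule and all truth values below are hypothetical illustrations, not examples from the systems above.

```python
# Sketch of PSL-style soft logic over truth values in [0, 1].
# The rule friend(A,B) AND votesFor(B,P) -> votesFor(A,P) and all
# truth values are hypothetical, chosen only for illustration.

def luk_and(a, b):
    """Lukasiewicz t-norm: soft conjunction."""
    return max(0.0, a + b - 1.0)

def luk_or(a, b):
    """Lukasiewicz t-conorm: soft disjunction."""
    return min(1.0, a + b)

def luk_not(a):
    """Soft negation."""
    return 1.0 - a

def distance_to_satisfaction(body, head):
    """A ground rule body -> head is satisfied when head >= body;
    the hinge max(0, body - head) is the penalty minimized during inference."""
    return max(0.0, body - head)

friend_ab, votes_bp, votes_ap = 0.9, 0.8, 0.4
body = luk_and(friend_ab, votes_bp)
print(round(body, 3))                                      # 0.7
print(round(distance_to_satisfaction(body, votes_ap), 3))  # 0.3
```

Because every ground rule contributes such a hinge penalty, inference becomes a continuous, convex optimization problem, which is why reasoning in this setting can be handled efficiently, as noted above.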
6 X. Chen, S. Jia and Y. Xiang / Expert Systems With Applications 141 (2020) 112948
HL-MRFs can capture relaxed, probabilistic inference with Boolean logic and exact, probabilistic inference with fuzzy logic, making them useful models for both discrete and continuous data. They also introduce PSL to make HL-MRFs easy to define and use for large KGs.

4.3. Knowledge reasoning based on ontology

Knowledge reasoning over knowledge graphs, which is intimately bound up with ontology languages such as Resource Description Framework Schema (RDFS) and the Web Ontology Language (OWL), is closely related to ontology. A knowledge graph can be regarded as a data structure for knowledge storage. Although it does not have formal semantics, it can support reasoning by applying RDFS or OWL rules to a KG. Pujara et al. (2013b) have proven that an ontology represented in OWL EL is suitable for being transformed into a KG and reasoned over efficiently. Reasoning methods based on ontology mainly use more abstract frequent patterns, constraints, or paths to infer. When reasoning through the ontology conceptual layer, the concepts are mainly described by OWL, which provides rich statements and is capable of knowledge representation.

Zou, Finin, and Chen (2004) propose an inference engine, F-OWL, which uses a frame-based system to reason with OWL ontologies. F-OWL supports consistency checking of the knowledge base, extracts hidden knowledge via resolution, and supports further complex reasoning by importing rules. Sirin, Parsia, Grau, Kalyanpur, and Katz (2007) present the OWL-DL reasoner Pellet to support incremental reasoning against dynamic knowledge graphs by reusing the reasoning results from previous steps to update the process incrementally. Chen, Goldberg, Wang, and Johri (2016) propose the ontological pathfinding (OP) algorithm, which generalizes to web-scale KBs through a range of optimization and parallelization technologies: a relational KB model to apply reasoning rules in turn, a novel rule mining algorithm that divides the mining task into smaller independent child tasks, and a pruning strategy that removes noisy and resource-consuming rules before using them. Wei, Luo, and Xie (2016a) propose and implement a distributed knowledge graph reasoning system (KGRL) based on OWL2 RL inference rules. KGRL has more powerful reasoning ability due to its more expressive rules. It can eliminate redundant data and make the reasoning result more compact through optimization. In addition, it can also find inconsistent data within a knowledge graph.

For reasoning methods based on ontology to be efficient, it is important that they are scalable to large-scale knowledge graphs. Zhou et al. (2006) present Minerva, a storage and inference system for large-scale OWL ontologies. Minerva combines a DL reasoner and a rule engine for ontology inference to improve efficiency. To improve the scalability and performance of reasoning, Soma and Prasanna (2008) propose two methods to parallelize the inference process for OWL knowledge bases. In the data partitioning approach, the knowledge graph is partitioned and the complete rule base is applied to each subset of the KG. In the rule-base partitioning approach, the rule base is partitioned and each node of a parallel system applies one subset of rules to the original KG. Chen, Chen, Zhang, Chen, and Wu (2013b) present an OWL reasoning framework for massive and complex biomedical knowledge graphs, which takes advantage of the MapReduce algorithm and OWL property chain reasoning. Recently, Marx, Krötzsch, and Thost (2017) present a simpler, rule-based fragment of multi-attributed predicate logic that can be used for ontological reasoning on a large knowledge graph.

4.4. Knowledge reasoning based on random walk algorithm

A line of research has proven that incorporating path rules into knowledge reasoning can improve inference performance. Inspired by this, many researchers have injected path rules into knowledge reasoning tasks. The path ranking algorithm (PRA) (Lao & Cohen, 2010) is a general technique for performing reasoning in a graph. To learn an inference model for a particular edge type in a KB, PRA finds sequences of edge types that frequently link nodes that are instances of the edge type being predicted. PRA then uses those path types as features in a logistic regression model to predict missing edges in the graph. A typical PRA model is composed of three components: feature extraction, feature computation, and relation-specific classification. The first step is to find a set of latent valuable path types that link the entity pairs. To this end, PRA performs path-constrained random walks over the graph to record those paths that start from h and end at t within a limited length. The second step is to compute the values in the feature matrix by calculating random walk probabilities. Given a node pair (h, t) and a path π, PRA computes the feature value as a random walk probability p(t | h, π), i.e., the likelihood of reaching t when starting a random walk from h and following the relations contained in π. It is calculated as follows:

p(t | h, π) = Σ_{e′ ∈ range(π′)} p(h, e′; π′) · P(t | e′; r_l)

where π′ = r_1, …, r_{l−1} is the prefix of π = r_1, …, r_l, p(h, e′; π′) is the probability of reaching e′ from h along π′, and P(t | e′; r_l) = r_l(e′, t) / |r_l(e′, ·)|. Then, the probability of a specific relation r between an entity pair (h, t) is calculated. The last step is to train a classifier for each relation and obtain the weights of the path features using a logistic regression algorithm.

The PRA model not only has high accuracy but also significantly improves computational efficiency, and it provides an effective solution to the problem of reasoning over large-scale knowledge graphs. Lao, Mitchell, and Cohen (2011) have shown that a soft reasoning procedure based on a combination of constrained, weighted random walks through the KG can be used to reliably predict new beliefs for the KB. They describe a data-driven path-finding method, whereas the original PRA algorithm generates paths by enumeration. To make PRA applicable for reasoning on large-scale KGs, they modify the path generation procedure in PRA to generate only paths that are potentially useful for the task. Specifically, they demand that a path is included in the PRA model only if it retrieves at least one target entity in the training set and is of length less than l, because a small number of possible relation paths is beneficial for inference. Finally, the weighted probability and score of all paths between two entities is a measure of the likelihood that a relation exists between them. Furthermore, Lao, Subramanya, Pereira, and Cohen (2012) have also shown that path-constrained random walk models can effectively predict new beliefs by combining a large-scale parsed text corpus with background knowledge. Experimental results show that the model can infer new beliefs with high accuracy by combining syntactic patterns in parsed text and semantic patterns in the background knowledge.

Although the PRA method has good interpretability, one main problem of random walk inference is feature sparsity. To address this problem, Gardner, Talukdar, Krishnamurthy, and Mitchell (2014) incorporate vector similarity into random walk inference over KGs to reduce the feature sparsity inherent in using surface text. Namely, when following a sequence of edge types in a random walk, they permit the walk to follow edges that are semantically similar to the given edge types, as defined by some vector space embedding of the edge types. This combines notions of distributional similarity and symbolic logical inference, reducing the sparsity of the feature space constructed by PRA. On the one hand, reasoning over the whole knowledge graph is time-consuming, and inference is usually related to local information, so inference can be performed locally on the KG. On the other hand, global information is coarser-grained.
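The path-probability feature at the core of PRA can be sketched on a toy graph; the entities, relations, and facts below are invented purely for illustration.

```python
# Toy sketch of PRA's feature value p(t | h, pi): follow the relations of a
# path with uniform random-walk steps. The graph below is invented.
from collections import defaultdict

graph = defaultdict(list)  # (entity, relation) -> successor entities

def add_fact(h, r, t):
    graph[(h, r)].append(t)

add_fact("CharlotteBronte", "bornIn", "Thornton")
add_fact("Thornton", "locatedIn", "Yorkshire")
add_fact("CharlotteBronte", "livedIn", "Haworth")
add_fact("Haworth", "locatedIn", "Yorkshire")

def path_prob(h, path, t):
    """Probability of reaching t from h by following the relation types in
    `path`, choosing uniformly among the successors at each step."""
    dist = {h: 1.0}
    for r in path:
        nxt = defaultdict(float)
        for e, p in dist.items():
            succ = graph[(e, r)]
            for e2 in succ:
                nxt[e2] += p / len(succ)
        dist = nxt
    return dist.get(t, 0.0)

print(path_prob("CharlotteBronte", ["bornIn", "locatedIn"], "Yorkshire"))  # 1.0
```

In a full PRA model, such probabilities fill the feature matrix, and a logistic regression classifier is then trained per relation on top of them, as described above.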
When combined with fine-grained local information, such coarse global information can improve the accuracy of reasoning. Based on the above two reasons, Gardner and Mitchell (2015) define a simpler and more efficient algorithm called subgraph feature extraction (SFE). SFE performs only the first step of PRA. They first perform a local search to characterize the graph around the entity nodes when some node pairs are given. Then, they run a set of feature extractors over these local subgraphs to obtain feature vectors for each entity pair. It greatly outperforms PRA, not only in time complexity but also in inference performance.

Liu, Han, Jiang, Liu, and Geng (2017b) study two potential problems in the basic assumptions adopted by existing random walk models. First, the algorithm extracts relation path features through random sampling, which improves computational efficiency but sacrifices the utilization of existing information in the KG. Second, when a supervised learning method is used to establish the relational inference model, the effectiveness of the model depends on the training data, which is especially affected by data sparsity. Accordingly, the bidirectional semantics hypothesis and the inferential relation-specific graph hypothesis were proposed, and the two-tier random walk algorithm (TRWA) was designed and implemented. The model is shown in Fig. 1. The main idea of TRWA is to combine two different feature modeling methods, subdividing the topological structure of the KG into a global graph and local subgraphs and performing feature extraction on each separately. Finally, the global module and the local module are weighted and merged to obtain the complete logic rule inference algorithm.

A pure random walk without guidance has poor efficiency in finding useful formulas and may even mislead inference due to introduced noise. Although some heuristic rules have been proposed to guide random walks, they still do not perform well because of the variety of formulas. To solve this problem, Wei, Zhao, and Liu (2016b) propose a novel goal-oriented inference algorithm that employs the specific inference target as the direction at each step of the random walk. Specifically, to accomplish such a goal-guided mechanism, the algorithm dynamically estimates the potential of each neighbour at each step of the random walk. Therefore, the algorithm is more inclined to traverse structures that are helpful in inferring the target and to avoid transfer to noisy structures. Previous works on PRA usually neglect meaningful associations among certain relations and cannot obtain enough training data for less frequent relations. Wang, Liu, Luo, Wang, and Lin (2016) propose a new multi-task learning framework for PRA, called coupled PRA (CPRA). CPRA performs inference using a multi-task mechanism. It consists of two modules: relation clustering and relation coupling. The former is used to discover highly correlated relations automatically, and the latter is used for coupling the learning of these relations. Through further coupling these relations, CPRA significantly outperforms PRA in terms of inference performance.

In general, the trend in knowledge reasoning based on logic rules is to gradually abandon manual rules and instead use pattern recognition to mine rules or features automatically, training models with machine learning methods. This type of model represents the knowledge graph as a complex heterogeneous network, so reasoning tasks can be completed by transfer probability, shortest path, and breadth-first search algorithms. However, this representation method still has defects. First, the computational complexity of logic rule-based reasoning methods is still high, and their scalability is poor. Second, the nodes in the knowledge graph tend to obey a long-tailed distribution; that is to say, only a few entities and relations occur with high frequency, and most entities and relations appear rarely. Therefore, sparsity seriously affects inference performance. In addition, how to handle the multi-hop reasoning problem remains a great challenge for logical models. Consequently, Lin et al. (2015a) and Das, Neelakantan, Belanger, and McCallum (2017) restrict the length of paths to 3 steps at most, so that the paths reflect the logical connection between different objects. Therefore, scholars mainly focus on reasoning methods based on distributed representation, which are not sensitive to data sparsity and are more scalable.

5. Knowledge reasoning based on distributed representation

Previous works to mine and discover unknown knowledge have relied on logic rules and random walks over the graph for lack of parallel corpora. Recently, embedding-based approaches have gained much attention in natural language processing. As shown in Fig. 2, these models project the entities, relations, and attributes in the semantic network into a continuous vector space to obtain distributed representations. Researchers have proposed a large number of reasoning methods based on distributed representation, including tensor decomposition, distance, and semantic matching models.

5.1. Knowledge reasoning based on tensor factorization

In the inference process, a KG is often represented as a tensor, which is then used to infer unknown facts by tensor decomposition. Tensor decomposition is the process of decomposing high-dimensional arrays into multiple low-dimensional matrices. A three-way tensor X is employed, in which two modes are identically formed by the concatenated entities of the domain and the third mode holds the relations. A tensor entry X_ijk = 1 represents that the fact (ith entity, kth predicate, jth entity) exists. Otherwise, for unknown and unseen relations, the entry is set to zero. Then, the triplet score is calculated from the vectors obtained through factorization, and the candidate with the highest score is selected as the inference result.

The RESCAL model (Nickel, Tresp, & Kriegel, 2011) is a representative tensor factorization method. Fig. 3 provides an illustration of this method. RESCAL represents high-dimensional, multi-relational data as a third-order tensor and factorizes it.
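The scoring that falls out of a RESCAL factorization is bilinear: f(h, r, t) = e_hᵀ W_r e_t, with a d × d matrix W_r per relation and d-dimensional entity vectors. A minimal sketch with made-up numbers:

```python
# Bilinear RESCAL-style score f(h, r, t) = e_h^T W_r e_t, computed without
# external libraries. The tiny embeddings below are invented for illustration.

def rescal_score(e_h, W_r, e_t):
    return sum(e_h[i] * sum(W_r[i][j] * e_t[j] for j in range(len(e_t)))
               for i in range(len(e_h)))

e_h = [1.0, 0.0]          # hypothetical head-entity embedding
e_t = [0.0, 1.0]          # hypothetical tail-entity embedding
W_r = [[0.1, 0.9],        # hypothetical relation matrix
       [0.2, 0.3]]

print(rescal_score(e_h, W_r, e_t))  # 0.9
```

After factorizing the observed tensor, candidate triples are ranked by this score and the highest-scoring candidate is taken as the inference result, as described above.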
Fig. 2. Translations operating on the low-dimensional embeddings of the entities from knowledge graph.
The factorization reduces the data dimension and retains the characteristics of the original data. It can be used for reasoning over knowledge graphs and achieves good results. Nickel, Tresp, and Kriegel (2012) demonstrate that tensor decomposition in the form of RESCAL factorization is a fitting reasoning method for the binary relational data of the semantic web and that the factorization is capable of successfully reasoning over unseen triples on YAGO. Chang, Yih, Yang, and Meek (2014) propose a new knowledge inference model, TRESCAL, which is highly efficient and scalable. They improve the tensor decomposition model with two innovations. First, they remove the triples that do not satisfy the relation constraints from the loss. Second, they introduce a mathematical technique that significantly reduces both time and space complexity. Nickel and Tresp (2013) also extend RESCAL tensor factorization with logistic regression. RESCAL-Logit uses different optimization strategies to improve inference accuracy. In Wu, Zhu, Liao, Zhang, and Lin (2017), PRESCAL, a path-based tensor factorization model, is proposed. It employs PRA to find all paths connecting the source and target nodes. Then, these paths are decomposed by tensor factorization for reasoning. Jain et al. (2017) develop a novel combination of matrix factorization (MF) and tensor factorization (TF) for knowledge base inference. It shows that the inference algorithm works robustly across diverse data and that model combination can yield better inference performance.

5.2. Knowledge reasoning based on distance model

TransE (Bordes, Usunier, Garcia-Duran, Weston, & Yakhnenko, 2013) is a commonly used embedding model and is the motivating base model. Since this model was proposed, a great deal of work has built on it due to its simplicity and efficiency. The structured embedding (SE) method (Bordes, Weston, Collobert, & Bengio, 2011) is a simple precursor of TransE. SE uses two separate matrices to project the head and tail entities for each relation and uses the topology information of the KG to model entities and relations. Since SE models relations with two separate matrices, there is a problem of poor coordination between entities. In addition, SE performs poorly on large-scale KGs. Therefore, Bordes et al. propose a more simplified model called TransE. The model is inspired by the results of Mikolov, Chen, Corrado, and Dean, in which the model learns distributed word representations such as King − Man ≈ Queen − Woman. The TransE model translates the latent feature representations by a relation-specific offset instead of transforming them through matrix multiplication. In particular, the score function of TransE is defined as:

f(h, r, t) = ‖h + r − t‖_{l1/l2}

where ‖·‖ is the l1 or l2 norm of the difference vector. When reasoning is performed, the candidate entity or relation with a small score is taken as the inference result.

Despite its simplicity and efficiency, TransE cannot deal with One-to-N, N-to-One, and N-to-N relations effectively. For example, given an N-to-One relation, e.g., PresidentOf, TransE might learn indistinguishable representations for Trump and Obama, who have both been president of the United States, although they are completely different entities. There are similar problems with One-to-N and N-to-N relations. To overcome the disadvantage of TransE in dealing with complex relations, a useful idea is to allow an entity to have different representations when involved in different relations.
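TransE's translation-based score can be sketched directly from the definition above; the vectors are illustrative stand-ins, not trained embeddings.

```python
# TransE score f(h, r, t) = ||h + r - t|| with the l1 norm; a lower score
# means a more plausible triple. All vectors below are invented.

def transe_score(h, r, t):
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

h = [0.25, 0.5]        # hypothetical embedding of a head entity
r = [0.25, -0.25]      # hypothetical relation vector
t_good = [0.5, 0.25]   # satisfies h + r == t, so the score is 0
t_bad = [1.0, 1.0]

print(transe_score(h, r, t_good))  # 0.0
print(transe_score(h, r, t_bad))   # 1.25
```

During reasoning, candidate tails are ranked by this score and the smallest-score candidate is returned, matching the description above; training typically uses a margin-based ranking loss over corrupted triples.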
Fig. 6. Multiple types of entities of the relation location (the relation HasPart has at least two latent semantics: composition-related, as in (Table, HasPart, Leg), and location-related, as in (Atlantic, HasPart, NewYorkBay)) (Ji et al., 2015).
One line of temporal embedding work assumes that temporal ordering relations are relevant to each other and evolve dynamically. Know-Evolve (Trivedi, Dai, Wang, & Song, 2017) models the non-linear temporal evolution of knowledge components by using a bilinear embedding learning approach. They use a deep recurrent architecture to capture the dynamical characteristics of the entities. Chekol, Pirrò, Schoenfisch, and Stuckenschmidt (2017) present an MLN-based approach for reasoning over uncertain temporal knowledge graphs. Leblay and Chekol (2018) try to use side information from the atemporal part of the graph for learning temporal embeddings. HyTE (Dasgupta, Ray, & Talukdar, 2018) directly encodes time information to learn temporally aware embeddings.

5.3. Knowledge reasoning based on semantic matching model

SE uses two separate matrices to project the head and tail entities for each relation r, which cannot effectively represent the semantic connection between entities and relations. Semantic matching energy (SME) (Bordes, Glorot, Weston, & Bengio, 2012; 2014) first represents entities and relations with vectors and then models the correlations between entities and relations as semantic matching energy functions. SME defines both a linear form and a bilinear form for the semantic matching energy functions. The latent factor model (Jenatton, Roux, Bordes, & Obozinski, 2012) captures various orders of interaction in the data using a bilinear structure. DistMult (Yang, Yih, He, Gao, & Deng, 2015) simplifies RESCAL by restricting the relation matrix M_r to be diagonal, which reduces the number of parameters and shows good reasoning ability and scalability in terms of validating unseen facts on existing KBs. Nickel, Rosasco, and Poggio (2016b) propose holographic embeddings (HolE) to learn compositional vector space representations of knowledge graphs. HolE applies circular correlation to generate compositional representations. By using correlation as the compositional operator, HolE can capture rich interactions while remaining efficient to reason with and easy to train. A major problem of current representation-based relational inference models is that they often ignore the semantic diversity of entities and relations, which constrains their reasoning ability.
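DistMult's restriction of the relation matrix to a diagonal reduces the bilinear score to an element-wise triple product, Σ_k h_k r_k t_k; a minimal sketch with invented values:

```python
# DistMult score: with a diagonal relation matrix, f(h, r, t) reduces to
# sum_k h_k * r_k * t_k. The embeddings below are invented for illustration.

def distmult_score(h, r, t):
    return sum(hk * rk * tk for hk, rk, tk in zip(h, r, t))

h = [0.5, 1.0, -0.5]
r = [1.0, 0.5, 0.0]    # the diagonal of the relation matrix
t = [1.0, 1.0, 2.0]

print(distmult_score(h, r, t))                             # 1.0
print(distmult_score(t, r, h) == distmult_score(h, r, t))  # True
```

The second printed line highlights a known limitation of the diagonal form: the score is symmetric in h and t, so DistMult alone cannot model antisymmetric relations, which is part of the motivation for HolE and ComplEx.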
Fig. 7. The matrix factorization framework for learning first-order logic embeddings (Wang & Cohen, 2016).
Liu, Han, Yang, Liu, and Wu (2017c) propose a new assumption for relation reasoning in knowledge graphs, which claims that each relation reflects the semantic connection of some specific attention aspects of the corresponding entities and can be modelled by selectively weighting the constituents of the embeddings to help alleviate the semantic resolution problem. Accordingly, a semantic aspect-aware relation inference algorithm is proposed that can effectively improve the accuracy of relation inference on knowledge graphs.

Liu, Wu, and Yang (2017a) study knowledge inference from the perspective of analogical inference. They formulate analogical structures and leverage them in a scoring function for optimizing the latent representations of entities and relations. In order to handle a large variety of dyadic relations, including symmetric and antisymmetric relations, Trouillon et al. (2017) and Trouillon, Welbl, Riedel, Gaussier, and Bouchard (2016) propose ComplEx, based on complex embeddings. In ComplEx, each entity and relation is represented by a complex vector, and the scoring function is:

φ(r, s, o; Θ) = Re(⟨w_r, e_s, ē_o⟩)
             = Re(Σ_{k=1}^{K} w_{rk} e_{sk} ē_{ok})
             = ⟨Re(w_r), Re(e_s), Re(e_o)⟩ + ⟨Re(w_r), Im(e_s), Im(e_o)⟩
             + ⟨Im(w_r), Re(e_s), Im(e_o)⟩ − ⟨Im(w_r), Im(e_s), Re(e_o)⟩

where w_r ∈ C^K is a complex vector, ē_o is the complex conjugate of e_o, Re(x) takes the real part of x, and Im(x) takes the imaginary part of x. The score function multiplies w_r, e_s, and the conjugate of e_o and retains the real part of the final result. It uses the Hermitian dot product, which involves the conjugate of one of the two vectors. As a consequence, facts about antisymmetric relations can be handled well.

5.4. Knowledge reasoning based on multi-source information

Various kinds of auxiliary information, e.g., logical rules, textual descriptions, and entity types, can be combined to further enhance performance. In this section, we discuss how such information can be integrated.

Logic rules can capture the rich semantics of natural language and support complex reasoning, but they often do worse in reasoning over large-scale knowledge graphs due to their dependence on logical background knowledge. In contrast, distributional representations are efficient and enable generalization. Therefore, injecting logic rules into embeddings for inference has received wide attention.

Rocktäschel, Bošnjak, Singh, and Riedel (2014) use first-order logic rules to guide the learning of entities and relations and then perform logic reasoning. Furthermore, they propose a paradigm for learning embeddings of entity pairs and relations that combines the strengths of matrix factorization and first-order logic domain knowledge (Rocktäschel, Singh, & Riedel, 2015). Two techniques for injecting logical background knowledge, pre-factorization inference and joint optimization, are presented. For pre-factorization inference, they first perform logical inference on the training data and add the inferred facts as additional data. They also propose a joint objective that rewards predictions satisfying the given logical knowledge, thus learning embeddings that do not require logical inference at test time. Demeester, Rocktäschel, and Riedel (2016a) present a highly efficient method based on matrix factorization for incorporating implication rules into distributed representations for KB inference. In the model, external commonsense knowledge is used for relation inference. Wang and Cohen (2016) propose a matrix factorization method to learn first-order logic embeddings. An overview of the framework is shown in Fig. 7. In detail, they first use ProPPR's structural gradient method (Wang, Mazaitis, & Cohen, 2014a) to generate a set of inference formulas from knowledge graphs. Then, they use this set of formulas, background graphs, and training examples to generate ProPPR proof graphs. To perform reasoning on the formulas, they map the training examples into the rows of a two-dimensional matrix and the inference formulas into the columns. Finally, the learned embeddings are transformed into parameters for the formulas, which allows first-order logic to infer with learned formula embeddings. Guo, Wang, Wang, Wang, and Guo (2016) propose KALE, a novel method that learns entity and relation embeddings for reasoning by jointly modelling knowledge and logic. KALE consists of three key components: a triple modelling module, a rule modelling module, and a joint learning module. For triple modelling, they follow TransE to model triples. To model rules, they use t-norm fuzzy logics (Hájek, 2013), which define the truth value of a complex formula as a composition of the truth values of its constituents through t-norm based logical connectives. After unifying triples and rules as atomic and complex formulas, KALE minimizes a global loss to learn entity and relation embeddings. The larger the truth value is, the better the ground rules are satisfied. Embedding in this way can predict new facts that cannot even be directly inferred by pure logical inference.
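The ComplEx scoring function given earlier can be checked numerically: the direct Hermitian form Re(⟨w_r, e_s, ē_o⟩) agrees with its four-term real/imaginary expansion. The complex values below are arbitrary.

```python
# Numeric check that ComplEx's score Re(<w_r, e_s, conj(e_o)>) equals the
# four-term expansion over real and imaginary parts. Values are arbitrary.

w_r = [0.3 + 0.7j, -0.2 + 0.1j]
e_s = [0.5 - 0.4j, 0.9 + 0.2j]
e_o = [-0.1 + 0.6j, 0.4 - 0.3j]

direct = sum(w * s * o.conjugate() for w, s, o in zip(w_r, e_s, e_o)).real

def dot3(a, b, c):
    return sum(x * y * z for x, y, z in zip(a, b, c))

def re(v): return [x.real for x in v]
def im(v): return [x.imag for x in v]

expanded = (dot3(re(w_r), re(e_s), re(e_o))
            + dot3(re(w_r), im(e_s), im(e_o))
            + dot3(im(w_r), re(e_s), im(e_o))
            - dot3(im(w_r), im(e_s), re(e_o)))

print(abs(direct - expanded) < 1e-12)  # True
```

The asymmetry introduced by the conjugate on e_o is what lets ComplEx score (s, r, o) and (o, r, s) differently, handling antisymmetric relations as noted above.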
Recently, Ho, Stepanova, Gad-Elrab, Kharlamov, and Weikum (2018) have proposed an end-to-end rule learning system guided by external sources. It can learn high-quality rules with embedding support. pLogicNet (Qu & Tang, 2019) is proposed to combine existing rule-based methods and knowledge graph embedding methods. It models the distribution of all possible triplets with a Markov logic network, which is efficiently optimized with the variational EM algorithm. In the E-step, a knowledge graph embedding model is used to infer the hidden triplets, whereas in the M-step, the weights of the rules are updated based on the observed and inferred triplets. Zhang et al. (2019b) propose IterE, which learns embeddings and rules iteratively at the same time for knowledge graph reasoning.

Multi-source information such as textual information and type information, considered as a supplement to the structured information embedded in triples, is significant for inference in KGs. Wang, Zhang, Feng, and Chen (2014b) introduce a novel method of jointly embedding knowledge graphs and a text corpus so that entities and words or phrases are represented in the same vector space. Specifically, they define a coherent probabilistic TransE model (pTransE), which consists of three components: the knowledge model, the text model, and the alignment model. The knowledge model is used for fact modelling, and the alignment model guarantees that the embeddings of entities and words or phrases lie in the same space and impels the two models to enhance each other. Experimental results show that the proposed method is very effective in reasoning new facts and capable of analogical reasoning. Furthermore, Wang and Li (2016) propose a new text-enhanced knowledge embedding (TEKE) method that makes use of rich context information in a text corpus. The rich textual information is incorporated to expand the semantic structure of the knowledge graph to better support reasoning. In TEKE, they first annotate the entities in the corpus and construct a co-occurrence network composed of entities and words to bridge the knowledge graph and text information together. Based on the co-occurrence network, they define textual contexts for entities and relations and incorporate these contexts into the knowledge graph structure. Finally, a normal translation-based optimization procedure is used for knowledge inference. Experiments on multiple datasets show that TEKE successfully alleviates the structure sparseness that limits knowledge inference. He, Feng, Zou, and Zhao (2015b) integrate different knowledge graphs to infer new facts simultaneously. They present two improvements to the quality of reasoning over knowledge graphs. First, to reduce data sparsity, they utilize type consistency constraints between relations and entities to initialize negative data in the matrix. Second, they incorporate the similarity of relations between different knowledge bases into a matrix factorization model to exploit the complementarity of diverse knowledge bases. Xie, Liu, Jia, Luan, and Sun (2016) propose a novel method, TKRL, to take advantage of the rich information located in hierarchical entity types. They use a recursive hierarchical encoder and a weighted hierarchical encoder to construct type-specific projection matrices for entities. Experimental results show that type information is significant in both predictive tasks. Tang, Chen, Cui, and Wei (2019) further propose a novel model named MKRL to predict potential triples, which integrates multi-source information, including entity descriptions, hierarchical types, and textual relations.

Generally, representation learning is developing rapidly, and it has shown great potential in knowledge representation and reasoning over large-scale knowledge graphs. Knowledge representation learning, however, has a notable weakness: its interpretability is poor (Xie, Ma, Dai, & Hovy, 2017). Specifically, the values of entity and relation vectors lack clear physical meaning. Therefore, there is still a long way for reasoning methods based on distributed representation to go.

6. Knowledge reasoning based on neural network

As an important machine learning technique, the neural network basically imitates the human brain for perception and cognition. It has been widely used in the field of natural language processing and has achieved remarkable results. The neural network has a strong ability to capture features. It can transform the feature distribution of the input data from the original space into another feature space through nonlinear transformations and automatically learn the feature representations. Therefore, it is suitable for abstract tasks, such as knowledge reasoning.

Neural networks have been used for knowledge graph inference for a long time (Nickel, Murphy, Tresp, & Gabrilovich, 2016a). In the SE model, the parameters of the two entity vectors do not interact with each other. To alleviate the problems of the distance model, Socher, Chen, Manning, and Ng (2013) introduce a single layer model (SLM), which connects the entity vectors implicitly through the nonlinearity of a standard single-layer neural network. SLM can be used for reasoning about the relations between two entities. However, the non-linearity provides only a weak interaction between the entity vectors. To this end, Socher et al. (2013) introduce an expressive neural tensor network (NTN) for reasoning, illustrated in Fig. 8. The NTN model replaces a standard linear neural network layer with a bilinear tensor layer that directly relates the two entity vectors across multiple dimensions. NTN initializes the representation of each entity by averaging its word vectors, which improves performance. Chen, Socher, Manning, and Ng (2013a) improve NTN by initializing entity representations with word vectors learned in an unsupervised manner from text; when doing this, existing relations can even be queried for entities that are unseen in the knowledge graphs. The increasing size of knowledge graphs and the complexity of the feature space make the parameter size of reasoning methods extremely large. Shi and Weninger (2017b) present a shared variable neural network model called ProjE that, through a simple change in the architecture, achieves a smaller parameter size. Liu et al. (2016a) propose a new deep learning approach, called the neural association model (NAM), for probabilistic reasoning in artificial intelligence. They investigate two NAM structures, namely, the deep neural network (DNN) and the relation-modulated neural network (RMNN). In the NAM framework, all symbolic events are represented in a low-dimensional vector space to solve the problem of insufficient representation ability faced by existing methods. Experiments on several reasoning tasks have demonstrated that both DNN and RMNN can outperform conventional methods.

6.1. Knowledge reasoning based on convolutional neural networks

With the rise of deep learning, attempts are being made to introduce deep learning technology into the field of knowledge reasoning (Collobert et al., 2011). Xie et al. (2016) assert that most existing translation-based inference methods concentrate only on the structural information between entities, regardless of the rich information encoded in entity descriptions. For example, the phrase Yao Ming is a famous basketball player in China that contains the
learning can effectively solve the issue of data sparseness, and nationality information and occupational information of the entity
the efficiency in knowledge reasoning and semantic computing is Yao Ming simultaneously, and these multi-source heterogeneous
higher than that of logic-based model. Based on TransE model, information can be used for handling the problem of data spar-
a number of improved knowledge graph inference methods have sity effectively and enhancing the ability of distinguishing between
been proposed. However, the interpretability of these methods is entities and relation. Accordingly, they propose a novel method
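Several of the methods above build on the same translation-based scoring idea that TransE introduced: a triple (h, r, t) is plausible when the head embedding translated by the relation embedding lands near the tail embedding. A minimal illustrative sketch, with toy embedding values that are not taken from any real model:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style plausibility score: a smaller ||h + r - t||
    (L1 distance here) means the triple (head, relation, tail)
    is considered more plausible."""
    return np.linalg.norm(h + r - t, ord=1)

# Toy 3-dimensional embeddings (illustrative values only).
head = np.array([1.0, 0.0, 0.5])
relation = np.array([0.0, 1.0, 0.0])
tail_true = np.array([1.0, 1.0, 0.5])    # equals head + relation exactly
tail_false = np.array([3.0, -1.0, 2.0])  # far from head + relation

# The true tail scores strictly better (lower) than the false one.
assert transe_score(head, relation, tail_true) < transe_score(head, relation, tail_false)
```

In practice the embeddings are learned, typically by minimizing a margin-based loss that separates observed triples from randomly corrupted ones.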
X. Chen, S. Jia and Y. Xiang / Expert Systems With Applications 141 (2020) 112948 13
Accordingly, they propose a novel method for knowledge inference, named description-embodied knowledge representation learning (DKRL), which is able to make use of both fact triples and entity descriptions. DKRL uses two encoders to represent the semantics of entity descriptions, a continuous bag-of-words (CBOW) model and a deep convolutional neural model, which can reduce the effect of data sparsity on the performance of inference models. DKRL also takes the zero-shot scenario into consideration, in which knowledge graphs contain some novel entities with only descriptions; it can learn representations for these novel entities automatically from their descriptions. Experiments in the zero-shot scenario show that the DKRL model can still achieve favourable results on the reasoning tasks. In order to infer new entities outside the knowledge graph, Shi and Weninger (2017a) further propose a new open-world KGC task and introduce a model called ConMask to solve it. ConMask uses relationship-dependent content masking to highlight words that are relevant to the task and then trains a fully convolutional neural network (FCN) for target fusion. Experiments on both open-world and closed-world datasets show that ConMask achieves good performance. However, these reasoning methods ignore the rich attribute information in the knowledge graph; attributes such as age and gender can characterize entities. To this end, Tay, Tuan, Phan, and Hui (2017) first propose a novel multi-task neural network (MT-KGNN), which learns representations of entities, relations and attributes by encoding attribute information in the process of reasoning. MT-KGNN consists of RelNet and AttrNet: RelNet models the structure and relations of the knowledge graph, while AttrNet models entities and their corresponding attributes. Notably, relations and attributes must be predefined; otherwise, a large number of invalid calculations will occur and seriously affect the inference accuracy.

Annervaz, Chowdhury, and Dukkipati (2018) introduce a convolution-based model for knowledge inference. First, they use the DKRL encoding scheme, as it emphasizes the semantic description of the text. Afterward, entity and relation vectors are computed as attention-weighted sums. Experiments show a significant improvement in performance on the natural language inference (SNLI) dataset. Dettmers, Minervini, Stenetorp, and Riedel (2018) propose ConvE, a multi-layer convolutional network model for knowledge inference that can scale to large knowledge graphs. The architecture of ConvE is illustrated in Fig. 9. In ConvE, the embedding representation of an (s, r) pair is reshaped into a matrix and treated as an image from which features are extracted with a convolution kernel. Unlike other inference methods, ConvE uses 1-N scoring to increase convergence speed. Ravishankar, Talukdar et al. (2017) observe that using a predefined scoring function, as in ConvE, might not perform well across all datasets. They define a simple neural-network-based score function, ER-MLP-2d, to fit different datasets. ER-MLP-2d, a variant of ER-MLP (Schlichtkrull et al., 2018), translates concatenated head and tail embeddings using the relation embedding of size 2d, and achieves competitive performance on different datasets.

6.2. Knowledge reasoning based on recurrent neural networks

Knowledge reasoning techniques that fuse relation paths and neural networks are also worth exploring. Neelakantan, Roth, and McCallum (2015) propose Path-RNN, an approach that composes the implications of a path using a recurrent neural network (RNN) and reasons about conjunctions of multi-hop relations non-atomically. Path-RNN uses PRA to find distinct paths for each relation type and then takes the embeddings of the binary relations in the path as input vectors. It outputs a vector in the semantic neighbourhood of the relation between the first and last entities of the path. For example, as shown in Fig. 10, after consuming the relation vectors along the path Microsoft → Seattle → Washington → USA, Path-RNN produces a vector semantically close to the relation CountryofHeadquarters. Shen, Huang, Chang, and Gao (2016) propose Implicit ReasoNets (IRNs), which learn to traverse knowledge graphs in vector space and infer missing triples. Rather than using human-designed relation paths in symbolic space and training a model separately, they propose to learn relation paths in vector space jointly with model training, without using any additional information. Implicit ReasoNets also provide ways to understand the inference process. Das et al. (2017) note that the Path-RNN model has three defects: (1) it reasons about chains of relations but not about the entities that make up the nodes of the path; (2) it takes only a single path as evidence when making new predictions; and (3) it is impractical for downstream tasks, since it requires training and maintaining a separate model for each relation type. Therefore, they present Single-Model, which shares the relation type representation and the composition matrices of the recurrent neural network across all target relations, enabling the same training data to be represented by a reduced number of parameters. Single-Model significantly increases the accuracy and practicality of RNN-based reasoning on Horn clause chains in large-scale KBs. Wang, Li, Zeng, and Chen (2018c) introduce an attention mechanism for the multi-hop reasoning problem. After finding reasoning paths between entities, they aggregate the paths' embeddings into one according to their attention weights and infer the relation based on the combined embedding.

Triples are not natural language; they model complex structure with a fixed expression (h, r, t). Such short sequences may be too under-representative to provide enough information for inference. Meanwhile, it is costly and difficult to construct useful long sequences from massive numbers of paths, and it is inappropriate to treat the two as the same type of input. To solve the above problems, Guo, Zhang, Ge, Hu, and Qu (2018) propose DSKG that employs respective multi-layer
model. The graphic model regards the paths as discrete latent variables and the relation as the observed variable, with a given entity pair as the condition; thus, the path-finding module can be viewed as a prior distribution for predicting potential links in the knowledge graph. In contrast, the path-reasoning module can be regarded as the likelihood distribution, which categorizes potential links into multiple classes. With this assumption, they introduce an approximate posterior and design a variational auto-encoder (Kingma & Welling, 2014) algorithm to maximize the evidence lower bound. This variational framework unifies the two modules and trains them jointly. Through active cooperation and interaction, the path finder can take the value of the searched paths into account and resort to more useful paths, while the path reasoner can obtain more diverse paths from the path finder and generalize better to unseen scenarios. Lin, Socher, and Xiong (2018) propose two modelling improvements for RL-based knowledge graph reasoning: reward shaping and action dropout. Reward shaping combines the ability to model the semantics of triples with the symbolic reasoning capability of the path-based approach, and hard action dropout is effective in encouraging the policy to sample various paths.

Reasoning methods based on neural networks attempt to use the powerful learning ability of neural networks to represent the triples in knowledge graphs and thereby obtain better reasoning ability. However, the interpretability of neural networks remains an open problem in knowledge graph reasoning, and how to explain their reasoning ability is worth studying. To date, there has been little research on reasoning methods based on neural networks; however, their powerful representation ability and outstanding performance in other fields promise broad prospects. In the future, it is worth exploring how to extend existing neural network methods to the field of knowledge graph reasoning.

7. Application of knowledge graph reasoning

Knowledge graph reasoning methods infer unknown relations from existing triples, which not only provides an efficient correlation discovery capability for resources in large-scale heterogeneous knowledge graphs but also completes them. Techniques such as consistency inference ensure the consistency and integrity of the knowledge graph. Inference techniques can perform domain knowledge reasoning by modelling domain knowledge and rules, which can support automatic decision making, data mining and link prediction. Due to this powerful reasoning ability, knowledge graphs can be widely used in many downstream tasks. In this section, we categorize these tasks into in-KG applications and out-of-KG applications, described as follows.

7.1. In-KG applications

7.1.1. KG completion

Constructing a large-scale knowledge graph requires constantly updating the relations between entities. However, despite their seemingly immense size, these knowledge bases are missing substantial amounts of information. For example, over 70% of the people included in Freebase have no known place of birth, and 99% have no known ethnicity (West et al., 2014). One way to fill in missing facts in a knowledge base is to infer unknown facts based on existing triples, which is called knowledge graph completion, also known as link prediction (Liu, Sun, Lin, & Xie, 2016b).

Due to noisy data sources and the inaccuracy of the extraction process, noisy and contradictory knowledge also exists in knowledge graphs (Dong et al., 2014). A major problem of NELL is that the accuracy of the knowledge it acquires gradually decreases as it continues to operate. After the first month, NELL has an estimated precision of 0.9; after two months, the precision has fallen to 0.71. The underlying reason is that the extraction patterns are not perfectly reliable, so false instances are sometimes extracted. These false instances are then used to extract increasing numbers of unreliable extraction patterns and further false instances, which finally dominate the knowledge base. NELL uses periodic human supervision to alleviate incorrect triples, but human supervision is very expensive. Thus, knowledge graph reasoning methods are required to clean a noisy knowledge base automatically.

7.1.2. Entity classification

Entity classification aims to determine the categories (e.g., person, location) of a certain entity, e.g., BarackObama is a person, and Hawaii is a location. It can be treated as a special entity prediction task because the relation encoding entity types (denoted as IsA) is contained in the KG and has already been included in the embedding process. Thus, entity classification is essentially a KG completion problem.

7.2. Out-of-KG applications

7.2.1. Medical domain

At present, the medical domain is one in which knowledge graphs are actively used, and it is also a research focus in artificial intelligence. When applied to medical knowledge graphs, knowledge reasoning methods can help doctors collect health data, diagnose diseases, and control errors (Yuan et al., 2018). For example, Kumar, Singh, and Sanyal (2009) propose a hybrid method based on case-based reasoning and rule-based reasoning to build a clinical decision support system for an intensive care unit (ICU). García-Crespo, Rodríguez, Mencke, Gómez-Berbís, and Colomo-Palacios (2010) design an ontology-driven differential diagnosis system (ODDIN) based on logical inference and probabilistic refinements. Martínez-Romero et al. (2013) build an ontology-based system for the intelligent supervision and treatment of critical patients with acute cardiac disorders, in which the expert's knowledge is represented by an OWL ontology and a set of SWRL rules; on the basis of this knowledge, the inference engine executes the reasoning process and provides the doctor with a recommendation about the patient's treatment. Ruan, Sun, Wang, Fang, and Yin (2016) convert the data stored in a traditional Chinese medicine knowledge graph into inference rules and then combine them with patient data to infer ancillary prescriptions based on the knowledge graph.

Even for the same disease, doctors may make different diagnoses according to the patient's condition because of the medical domain's dependence on subjective judgment. Thus, medical knowledge graphs must address a large amount of repetitive and contradictory information, which increases the complexity of medical reasoning models. Although traditional knowledge reasoning methods promote the automation of medical diagnosis, they also suffer from insufficient learning ability and low data utilization rates. In the face of increasing medical data, it is inevitable that some information will be missing and that diagnosis will be too time consuming. To solve these problems, we need to explore and study efficient medical reasoning models.

7.2.2. Internet finance

Finance is another active area in which knowledge graphs have been used. The investment and employment relationships in a knowledge graph can be used to identify stakeholder groups through clustering algorithms. When some nodes change or large events occur, associations between the changed entities can be inferred by path sorting and subgraph discovery methods. In the finance industry, anti-fraud is an important task.
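The link-prediction formulation of KG completion described in Section 7.1.1 amounts to scoring every candidate entity for a query (h, r, ?) and ranking them. A small sketch of that ranking step, using a TransE-style distance as a placeholder scorer and invented toy embeddings:

```python
import numpy as np

def rank_candidates(score_fn, h, r, candidates):
    """Rank candidate tail entities for the query (h, r, ?);
    lower score means more plausible, so sort ascending."""
    scored = [(name, score_fn(h, r, emb)) for name, emb in candidates.items()]
    return [name for name, _ in sorted(scored, key=lambda item: item[1])]

# Placeholder scoring function (TransE-style L2 distance).
score = lambda h, r, t: float(np.linalg.norm(h + r - t))

# Toy 2-dimensional embeddings for a query such as (BarackObama, BornIn, ?).
h = np.array([0.0, 1.0])
r = np.array([1.0, 0.0])
candidates = {"Hawaii": np.array([1.0, 1.0]),  # equals h + r, so ranked first
              "Paris": np.array([5.0, 5.0])}

assert rank_candidates(score, h, r, candidates)[0] == "Hawaii"
```

Completion benchmarks then report where the true entity lands in this ranking (e.g., mean rank or hits@k over many queries).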
Through knowledge inference, people can verify the consistency of information to identify fraud in advance (Kapetanakis, Samakovitis, Gunasekera, & Petridis, 2012). In addition, knowledge inference plays an important role in securities investment (He, Ni, Cao, & Ma, 2016). For example, Ding, Zhang, Liu, and Duan (2016) propose a joint model that combines knowledge graph information and event embeddings for stock prediction. However, this work does not capture the structural information in the text, which is very important for predicting whether a stock will rise or fall. Therefore, Liu, Zeng, Yang, and Carrio (2018) propose a joint learning model over tuples and texts, using the TransE model and a convolutional neural network to capture the structured information in event tuples. The predictive results can support business decisions and improve investment planning.

Knowledge graph reasoning has improved the efficiency of resource allocation in the finance industry, strengthened risk management and control, and effectively promoted the development of the financial industry. However, current data analysis and reasoning methods struggle to meet the requirements of large-scale data analysis due to the low standardization of finance industry data and its dispersion across multiple data systems. In response to this problem, external knowledge bases should be introduced to achieve reasoning over cross-domain large-scale knowledge graphs.

7.2.3. Intelligent question answering systems

KB-based question answering (KBQA) analyzes a query question and then finds the answer in the knowledge base. However, KBQA also needs the support of reasoning techniques because the knowledge graph is incomplete. For example, Watson, in which knowledge reasoning plays an important role, defeated humans at Jeopardy; the questions of Jeopardy cover various areas and require contestants to analyze and reason about entailment, irony and riddles. Intelligent question-answering systems such as Apple's Siri, Microsoft's Cortana, and Amazon's Alexa all require the support of knowledge graph inference.

The development of knowledge reasoning technology has laid a technical foundation for intelligent question-answering systems. For example, Jain (2016) presents a factual memory network that answers questions by extracting and reasoning over relevant facts from Freebase. It represents questions and triples in the same vector space, generates candidate facts, and then finds the answer using multi-hop reasoning. Zhang, Dai, Kozareva, Smola, and Song (2018a) propose an end-to-end variational reasoning network (VRN) for question answering with a knowledge graph. VRN first recognizes the topic entity; given the topic entity, the answer to the question can be retrieved through multi-hop reasoning on the knowledge graph. Narasimhan, Lazebnik, and Schwing (2018) propose an algorithm based on graph convolutional networks (GCN) (Kipf & Welling, 2016) for reasoning in visual question answering. When answering questions, they combine the visual situation with general knowledge encoded in the form of a knowledge base. However, some problems remain to be solved in intelligent question-answering systems. First, KBQA mainly focuses on single-fact questions, i.e., questions whose answers require only one triple in the KG; for complex problems that require multi-step reasoning, for example, answering "What's the name of Yao Ming's wife's daughter?", KBQA performs poorly. Recently, Zhang, Dai, Toraman, and Song (2018b) imitated the human brain to address this problem. Second, current knowledge bases are composed of factual knowledge and lack common sense. Common sense plays an important role in human reasoning, yet common-sense knowledge is difficult to standardize. Therefore, incorporating common-sense knowledge into KBQA for reasoning is a key issue in intelligent question answering.

7.2.4. Recommendation systems

Recommendation systems based on knowledge graphs connect users and items and can integrate multiple data sources to enrich semantic information. Implicit information can be obtained through reasoning techniques to improve recommendation accuracy. There are several typical cases of recommendation based on knowledge graph reasoning, such as shopping, movie and music recommendation. Wang et al. (2018a) propose the knowledge-aware path recurrent network (KPRN), which not only generates representations for paths by accounting for both entities and relations but also performs reasoning based on paths to infer user preferences. Unlike existing approaches that focus only on leveraging knowledge graphs for more accurate recommendation, Xian, Fu, Muthukrishnan, de Melo, and Zhang (2019) propose a policy-guided path reasoning (PGPR) method, which can reason over a knowledge graph for recommendation with interpretation. PGPR is a flexible graph reasoning framework and can be extended to many other graph-based tasks such as product search and social recommendation.

With the help of reasoning techniques, it is possible to use multi-source heterogeneous data in recommendation systems. However, this direction is still in its initial stage of development and faces many challenges. In the future, solving the cold-start problem and performing explicit reasoning over knowledge for recommendation systems are worth exploring.

7.2.5. Other applications

Knowledge reasoning techniques also play an important role in other intelligent scenarios. For example, knowledge reasoning technology can be used to understand the user's query intent in search engines. In addition, it can be used for other computational linguistics tasks such as plagiarism detection, sentiment analysis, document categorization, and spoken dialogue systems. Specifically, Franco-Salvador, Gupta, Rosso, and Banchs (2016a) and Franco-Salvador, Rosso, and Montes-y Gómez (2016b) study hybrid models that combine a knowledge graph reasoning approach with continuous representation methods for the task of cross-language plagiarism detection. Cambria, Olsher, and Rajagopal (2014) show how to use SenticNet 3 and COGBASE to infer the polarity of a sentence. Franco-Salvador, Cruz, Troyano, and Rosso (2015) propose the use of meta-learning to combine and enrich current approaches by adding knowledge-based features obtained through inference to solve single- and cross-domain polarity classification tasks. Franco-Salvador, Rosso, and Navigli (2014) leverage a multilingual knowledge graph, i.e., BabelNet, to obtain language-independent knowledge representations of documents and solve two tasks: comparable document retrieval and cross-language text categorization. Ma, Crook, Sarikaya, and Fosler-Lussier (2015) propose an Inference Knowledge Graph that forms part of a spoken dialogue system. Wang et al. (2018b) propose a graph reasoning model (GRM) to reason about the relationship between two persons in an image based on a social knowledge graph. As deep neural networks become widely used in natural language processing tasks, knowledge inference will usher in broader prospects.

8. Discussion and research opportunities

With the development of KGs, knowledge graph reasoning has been widely explored and utilized in multiple knowledge-driven tasks, significantly improving their performance. In this section, we first give a brief summary of these methods to identify the gap and then propose research opportunities for knowledge graph reasoning.
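The multi-step reasoning that the question-answering discussion in Section 7.2.3 calls for (e.g., "Yao Ming's wife's daughter") can be viewed as following a chain of relations through the graph. A toy sketch of that relation-chaining step, in which the triples and entity names are purely illustrative:

```python
def follow_path(kg, start, relations):
    """Follow a chain of relations from a start entity over a set of
    (head, relation, tail) triples, returning the entities reached.
    Answering "Yao Ming's wife's daughter" becomes two hops:
    wife, then daughter."""
    frontier = {start}
    for rel in relations:
        frontier = {t for (h, r, t) in kg if h in frontier and r == rel}
    return frontier

# Toy knowledge graph (entity names beyond the question are illustrative).
kg = {("YaoMing", "wife", "YeLi"),
      ("YeLi", "daughter", "YaoQinlei")}

assert follow_path(kg, "YaoMing", ["wife", "daughter"]) == {"YaoQinlei"}
```

Real KBQA systems additionally have to map the natural-language question onto such a relation chain and cope with missing triples, which is exactly where the reasoning methods surveyed above come in.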
Table 3
Summary of knowledge reasoning models.

8.1. Summary

In this paper, we provide a broad overview of currently available techniques, including rule-based reasoning methods, distributed representation-based reasoning methods, and neural network-based reasoning methods. A summary of the advantages, disadvantages, representative works and applications of each type of model is shown in Table 3.

To sum up, there are differences and parallels between these three classes of reasoning methods, and they are complementary in inference tasks. Their relevance lies in the fact that all of them abstract the knowledge graph into a topology and then use the topological relations between entities to model features and learn parameters. The main difference is that knowledge inference models based on neural networks integrate a CNN or RNN into the representation learning model or the logic rule model, extract features through the self-learning ability of the deep learning model, and then utilize its memory and reasoning ability to establish an entity relation prediction model. The representation learning model projects entities and relations into a low-dimensional vector space and performs reasoning based on semantic expression. Its advantage is that the structural information in the KG can be fully utilized when generating knowledge representation vectors; its disadvantage is that prior knowledge cannot be introduced into the modelling to assist inference. The logic rule model uses abstract or concrete Horn clauses for reasoning and is essentially rule-based reasoning. Its advantage is that it can simulate human logical reasoning behaviour and introduce human prior knowledge to assist in reasoning. Its disadvantage is that a series of problems remain unsolved, including dependence on domain experts, high computational complexity, and poor generalization ability.

8.2. Research opportunities

Although existing models have already shown their power in reasoning over KGs, there are still many possible improvements to be explored. In this section, we discuss the challenges of knowledge graph reasoning and give potential research opportunities.

8.2.1. Dynamic knowledge reasoning

Existing knowledge graph reasoning approaches mainly focus on static multi-relational data but neglect the useful temporal information contained in knowledge graphs. However, knowledge is not static and evolves with time. We note that KG facts are not universally true; they tend to be valid only within a specific time scope. For instance, (BarackObama, PresidentOf, USA) was true only from 2009 to 2016. Therefore, it is worthwhile to take temporal information into account during reasoning. Only a few works address this problem, their efforts are still preliminary, and reasoning methods for dynamic knowledge graphs still need to be further explored.

8.2.2. Zero-shot reasoning

Existing knowledge graph reasoning models often require a large number of high-quality samples for training, which consumes considerable time and manpower. Recently, zero-shot learning has attracted much attention in many fields, such as computer vision and natural language processing. Zero-shot learning can learn from an unseen class or a class with only a few instances. In the reasoning process, the practical problem is that a large number of training samples cannot be obtained, rendering many knowledge reasoning models ineffective. It is natural that additional information, such as textual descriptions and multi-modal information, can help to deal with the zero-shot scenario. Besides, it is necessary to design a new framework that is more suitable for reasoning about entities outside KGs.

8.2.3. Multi-source information reasoning

With the rapid development of mobile communication technology, people can upload and share multimedia content, including text, audio, images, and videos, on the Web at any time. How to efficiently and effectively utilize this rich information is becoming a critical and challenging problem. Multi-source information has shown its potential to help reason over KGs, while existing methods of utilizing such information are still preliminary. More effective and elegant models could be designed to make better use of these kinds of information.

8.2.4. Multi-lingual knowledge graph reasoning

Many KGs, such as Freebase and DBpedia, have constructed multilingual versions by extracting structured information from Wikipedia. Multilingual KGs play important roles in many applications such as machine translation, cross-lingual plagiarism detection, and information extraction. However, to the best of our knowledge, only a few works have addressed reasoning over multilingual KGs. For example, Abouenour, Nasri, Bouzoubaa, Kabbaj, and Rosso (2014) construct an Arabic question-answering system that supports semantic reasoning, and Chen, Tian, Chang, Skiena, and Zaniolo (2018a) present a cross-lingual inference method for KG completion based on French and German KGs. Therefore, multilingual knowledge graph reasoning is also a significant but challenging topic to be studied.
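The temporal validity problem raised in Section 8.2.1 can be made concrete by attaching a time scope to each fact, as in the (BarackObama, PresidentOf, USA) example. A minimal sketch (the interval representation is illustrative, not a proposed model):

```python
def holds_at(temporal_kg, triple, year):
    """Check whether a triple is valid in a given year, using the
    (start_year, end_year) validity interval attached to each fact."""
    interval = temporal_kg.get(triple)
    return interval is not None and interval[0] <= year <= interval[1]

# Facts annotated with the time scope in which they are true.
tkg = {("BarackObama", "PresidentOf", "USA"): (2009, 2016)}

assert holds_at(tkg, ("BarackObama", "PresidentOf", "USA"), 2012)
assert not holds_at(tkg, ("BarackObama", "PresidentOf", "USA"), 2019)
```

A temporal reasoning method would go further and score or infer such scoped facts rather than merely look them up, but even this simple representation shows why static triples are insufficient.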
18 X. Chen, S. Jia and Y. Xiang / Expert Systems With Applications 141 (2020) 112948
9. Conclusions Akbik, A., & Löser, A. (2012). Kraken: N-ary facts in open information extraction.
In Proceedings of the joint workshop on automatic knowledge base construction
and web-scale knowledge extraction (pp. 52–56). Association for Computational
KG reasoning, which aims to infer new knowledge from existing Linguistics.
triplets, has played an important role in many tasks and attracted much attention. In this paper, we gave a broad overview of existing approaches, with a particular focus on three types of reasoning methods: rule-based methods, distributed representation-based methods, and neural network-based methods. Methods that conduct reasoning with logic rules were introduced first, and we described their model details along with their advantages and disadvantages. After that, we discussed more advanced approaches that perform KG reasoning with other information. The investigation of reinforcement learning for KG reasoning has only just begun and is likely to receive increasing attention in the near future. Finally, we discussed the remaining challenges of knowledge graph reasoning and its applications, and gave an outlook on further study of knowledge graph reasoning. We hope that this review will provide new insights for further study of KG reasoning.
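As a minimal illustration of the distributed representation-based family discussed above, the sketch below scores candidate triplets with a TransE-style translation model, in which a triplet (h, r, t) is considered plausible when h + r ≈ t in the embedding space. The two-dimensional vectors here are toy values chosen for illustration, not trained embeddings.

```python
from math import sqrt

def transe_score(h, r, t):
    """TransE-style plausibility: negative L2 distance ||h + r - t||.
    A score closer to 0 means the triplet (h, r, t) is more plausible."""
    return -sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 2-d embeddings (illustrative values, not trained).
emb = {
    "Paris": [0.0, 0.0],
    "France": [1.0, 0.0],
    "Germany": [3.0, 2.0],
    "capital_of": [1.0, 0.0],
}

# Rank candidate tail entities for the query (Paris, capital_of, ?).
candidates = ["France", "Germany"]
scores = {c: transe_score(emb["Paris"], emb["capital_of"], emb[c])
          for c in candidates}
best = max(scores, key=scores.get)  # "France": Paris + capital_of lands on it
```

In an actual TransE model the vectors are learned by minimizing a margin-based ranking loss over observed versus corrupted triplets; only the training differs, the scoring function above is the same.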
Declaration of Competing Interest

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.

We understand that the Corresponding Author is the sole contact for the Editorial process (including Editorial Manager and direct communications with the office). He is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. We confirm that we have provided a current, correct email address which is accessible by the Corresponding Author and which has been configured to accept email from [email protected].
CRediT authorship contribution statement

Xiaojun Chen: Conceptualization, Formal analysis, Investigation, Methodology, Writing - original draft, Writing - review & editing. Shengbin Jia: Formal analysis, Supervision. Yang Xiang: Funding acquisition, Supervision.
Acknowledgements

We would like to thank all the reviewers for their insightful and valuable comments, which significantly improved the quality of this paper. This work is supported by the National Natural Science Foundation of China under Grant no. 71571136 and the Project of Science and Technology Commission of Shanghai Municipality under Grant no. 16JC1403000.
X. Chen, S. Jia and Y. Xiang / Expert Systems With Applications 141 (2020) 112948
for knowledge base completion. In Proceedings of the 55th annual meeting of the ence exam questions with contextual knowledge graph embeddings. arXiv:1805.
association for computational linguistics (volume 1: Long papers) (pp. 950–962). 12393
Xie, R., Liu, Z., Jia, J., Luan, H., & Sun, M. (2016). Representation learning of knowl- Zhou, J., Ma, L., Liu, Q., Zhang, L., Yu, Y., & Pan, Y. (2006). Minerva: A scalable
edge graphs with entity descriptions. Thirtieth aaai conference on artificial intel- owl ontology storage and inference system. In Asian semantic web conference
ligence. (pp. 429–443). Springer.
Xiong, W., Hoang, T., & Wang, W. Y. (2017). Deeppath: A reinforcement learning Zou, Y., Finin, T., & Chen, H. (2004). F-owl: An inference engine for semantic
method for knowledge graph reasoning. In Proceedings of the 2017 conference on web. In International workshop on formal approaches to agent-based systems
empirical methods in natural language processing (pp. 564–573). (pp. 238–248). Springer.