
Expert Systems With Applications 207 (2022) 117991

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

Using text mining to establish knowledge graph from accident/incident reports in risk assessment
Chang Liu 1, Shiwu Yang *, 2
School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China

ARTICLE INFO

Keywords: Text mining; Knowledge graph; Named entity recognition; Machine learning; Risk assessment; Railway safety

ABSTRACT

To clarify the risk factors and propagation characteristics affecting railway safety, we learn from historical reports to build a connected network of hazards and accidents, forming a knowledge graph (KG), and apply it to railway hazard identification and risk assessment. First, the open-source British railway accident/incident reports are selected as the data source. The text augmentation algorithm in the text mining technology is introduced and optimized to achieve data enhancement. An ensemble model is constructed based on the hidden Markov model, conditional random field (CRF) algorithm, bidirectional long short-term memory (Bi-LSTM), and Bi-LSTM-CRF deep learning network, completing the named entity recognition of the reports. Then, using the random forest algorithm, the standardized classification of entities is accomplished, and the multi-dimensional knowledge graph network is established. Finally, after defining a series of safety-related feature parameters, the obtained KG is applied to the quantitative assessment of the corresponding risk level of the hazards. The results show that this approach realizes the visualization and quantitative description of the potential relationship among hazards, faults, and accidents by exploring the topological relationship of the railway accident network, further assisting the formulation of railway risk preventive measures.

1. Introduction

Safety is the essential requirement of railway transportation, and railway safety depends on the combined effect of rail traffic rules, reliability of infrastructure and rolling stock, organizational safety culture, and human factors (Kyriakidis, 2013; Kyriakidis, Majumdar, & Ochieng, 2015). As a deep integration of mobile equipment, control systems, and infrastructure, the railway system has a complex and changeable operating environment, and its risk and failure modes often present spatio-temporal randomness and diversity. Hazards affecting railway safety might initiate chain reactions in other branches through information and functional interactions among railway branches, leading to railway system equipment failures and even casualties. Influential catastrophic accidents include the derailment of the German Eschede train (June 1998) (Sakamoto, 2008), the derailment on the Japan Railway (JR) Fukuchiyama Line (April 2005) (Takaoka et al., 2007), the “7.23 EMU accident on Chinese Yongwen high-speed railway” (July 2011) (Wang et al., 2017), the train derailment at Santiago de Compostela in Spain (July 2013) (Shultz et al., 2016), and the derailment of the Taiwan Railway Taroko Express (April 2021) (CNN, 2021), each causing dozens or even hundreds of casualties. These accidents all reflect deficiencies in the railway safety operating guarantee system.

To effectively prevent railway safety accidents/incidents caused by equipment quality and personnel factors, traditional methods usually implement system safety evaluation through risk assessment of various indicators (Lyu et al., 2020; Pan et al., 2020; Yazdi et al., 2020). These methods generally rely on the experience of experts in related fields and are inherently subjective (Liu et al., 2020). Expert scoring data meeting both decision-making quality and quantity standards are also difficult to obtain under certain conditions. Therefore, obvious limitations exist in quantitatively describing the relationship between hazards and system safety risks. On the other hand, in the traditional railway operation model, especially in high-speed railway scenarios, safety-related decisions depend on the experience of dispatchers and drivers. However, dispatchers of varying skill levels usually carry a heavy workload, so they sometimes cannot process multi-source heterogeneous information effectively and rapidly, and have difficulty

* Corresponding author.
E-mail addresses: [email protected] (C. Liu), [email protected] (S. Yang).
1 ORCID: 0000-0002-8213-9465.
2 ORCID: 0000-0003-0996-2006.

https://doi.org/10.1016/j.eswa.2022.117991
Received 24 October 2021; Received in revised form 13 June 2022; Accepted 26 June 2022
Available online 30 June 2022
0957-4174/© 2022 Elsevier Ltd. All rights reserved.

making real-time and accurate decisions. Obviously, drawing lessons from historical accidents and proactively identifying, or even predicting, the potential risks of various hazards helps to improve the safety assurance level of the railway system. Although railway systems and equipment are gradually updated and upgraded with the advancement of technology, their inherent safety logic has not substantially changed. Additionally, empirical learning based on a large number of accidents can, to a certain extent, mitigate the negative impact that possible improper decision-making in a single report has on risk assessment.

Risk is closely related to accident statistics, and historical accident data may offer reasonable estimation and prediction for the future (Wang et al., 2021). Road traffic research has widely adopted data-driven accident/incident analysis (Mannering, 2018; Mannering et al., 2020; Parsa et al., 2019; Sangare et al., 2021). In railway safety, the approaches to learning from accident reports usually include causation analysis and statistical analysis of historical data (Liu et al., 2019). Li and Wang (2018) proposed a risk monitoring model based on a complex network in railway operation. Liu and Li (2018) gave a new cascading failure model to analyse the causes of railway accidents quantitatively. Lam and Tai (2020) presented a network analysis framework to identify the factors and effects of railway incidents. Li et al. (2019) proposed a hybrid human and organizational analysis method based on systems-theoretical accident modelling and processes (STAMP) and the human factors analysis and classification system (HFACS) to identify and analyse human and organizational factors involved in railway accidents. Chen, Xu, and Ni (2017) adopted rough set theory and association rules to mine data on Chinese train accidents. Zhou and Lei (2018) filtered railway accident/incident reports through the HFACS framework, presenting the occurrence paths of a railway accident/incident and revealing significant interdependences among human factors.

However, most current studies focus on a single-dimensional homogeneous network of accident causes, which is only suitable for describing the single-oriented relationship among nodes of the same type (Qiu et al., 2018) and does not apply to the heterogeneous railway network with multi-dimensional and multi-layer branch function coordination. Benchimol, Kazinnik, and Saadon (2022) reviewed various text mining methods and introduced their application in practical cases, providing ideas for text-based research. To further explore a multi-dimensional network topology analysis method appropriate for railway safety accidents, this paper proposes a novel knowledge graph (KG)-based network modelling and analysis approach. Based on a large amount of accident report text data processed by formatting and text augmentation (TA), the named entity recognition (NER) algorithm model is established and optimized through integrated deep learning, mining the potential hazard-related entities. Next, we standardize the causal transmission path among entities, establish a Risk Knowledge Graph in Railway Safety (RKGRS), define risk-related indicators, and realize the quantitative mapping among multi-level hazards and from hazards to risks.

This paper is organized as follows. Section 2 reviews related literature and work. Section 3 introduces the modelling method and process of RKGRS. In Section 4, based on the British railway accident reports, the optimal NER ensemble model is constructed by training and comparing different machine learning and deep learning models, combined with TA-based data enhancement and text mining (TM) technology. Section 5 utilizes the random forest (RF) algorithm to classify hazardous entities and constructs the RKGRS model based on standardized entity categories. Section 6 defines the safety-related feature parameters based on the topological network relationship and completes a series of hazard-based quantitative calculations and risk assessments. Finally, Section 7 concludes the paper.

2. Related work

At present, as an emerging network relationship visualization technology (Kejriwal, Sequeda, & Lopez, 2019), KG has been widely explored in artificial intelligence (Yoo & Jeong, 2020), semantic parsing (Rinaldi, Russo, & Tommasino, 2021), applications development (Nettleton & Salas, 2016), platform design (Lu et al., 2020), and other fields. In recent years, research on safety analysis has also introduced KGs to realize knowledge modelling and risk management. Zhang et al. (2019) analysed the transportation chain of maritime dangerous goods and constructed the corresponding KG. Luo et al. (2021) built a Bayesian network-based knowledge inference model for road transportation risks. Mao et al. (2020) designed a semi-automatic KG development solution for process safety in the chemical industry. Jiang et al. (2021) used knowledge storage to build a KG of construction safety standards. Focusing on road safety, Halilaj et al. (2021) proposed a KG modelling method suitable for situation understanding tasks as required in driver assistance and automated driving systems. Evidently, the exploration of knowledge and its links is one of the emerging technologies in the engineering field and has been well applied.

By contrast, knowledge-related research on railway safety appears to be relatively scarce. Bischof and Skinner (2021) introduced the KG development of the rail infrastructure ontology based on the topology model. Similar research on infrastructure lacks in-depth discussion of the safety field. Zhao et al. (2014) put forward a text mining-based fault diagnosis method for vehicle on-board equipment of the high-speed railway. Heidarysafa et al. (2018) used deep learning to classify accident causes based on report narratives. Hua, Zheng, and Gao (2019) applied the NER method in natural language processing (NLP) to extract the risk factors of railway accidents. The above studies all focused on the hidden dangers and potential risks of the railway operation process and selected the relevant report text as the data source for knowledge extraction and discovery, similar to our research ideas. Nevertheless, they did not give the complete construction of the knowledge system or the visualization of the knowledge links, which makes it difficult for them to provide intuitive and efficient decision-making guidance for field staff. Hughes et al. (2019) carried out a relatively thorough study. They proposed a method to extract meaning from multi-lingual free-text safety incident reports, realizing the storage and query of text records based on a graph database. But they did not extend the knowledge-level processing to the quantitative research of risk assessment, which is the main breakthrough we expect to achieve.

As one of the most typical and essential tasks in the TM technology of the NLP community (Habibi et al., 2017), NER is a direct approach to knowledge discovery from text data and has recently been widely applied in medical (Wu et al., 2021), social media (Gozuacik, Sakar, & Ozcan, 2021), maritime (He, Sun, & Wang, 2021), and other fields. A named entity refers to an entity that carries a certain special meaning or indexicality in a sentence. Common types include the name of a person, geographic name, institutional organization, and time expression. The NER algorithm aims to accurately identify the location of an entity in unstructured text and assign it to the correct category (Kwon, Ko, & Seo, 2019). However, unlike conventional research, the risk-factor-oriented hazard-related entities we are concerned about are not general existing entities (e.g., person, geographic name, etc.), so in addition to “recognition”, the designed algorithm also needs to “understand” the semantics itself. This process is similar to slot filling in natural language understanding (NLU), i.e., identifying semantic slots and understanding the corresponding intents (Tang, Ji, & Zhou, 2020). To sum up, in this paper, we build a NER (in a broad sense) algorithm model by defining hazard-related entities containing the characteristics of text reports to achieve in-depth mining of accident/incident information.

3. Methodology

Knowledge is structured and organized information after cognitive processing and verification, so it can reduce the risk rate or avoid risks as much as possible for the enterprise and all its stakeholders in the risk management process (Duan et al., 2017). Taking concepts, entities, and


Fig. 1. RKGRS modelling scheme and process.

literals as nodes and different kinds of relationships as edges, the knowledge collection can be represented in the form of graphs (Wu et al., 2018), i.e., the modelling of KGs.

Fig. 1 shows the modelling methodology of the RKGRS, and the specific processes are as follows:

Step 1: select historical accident/incident reports as the original data set.
Step 2: extract knowledge entities from structured or semi-structured data, including hazard related and unrelated entities.
Step 3: discover the semantic relationship among entities or concepts from different sources and finally describe the causal relationship among hazards, risks, and accidents.
Step 4: fuse knowledge from different data sources and normalize entities, attributes, and attribute values with the same meaning but different string labels according to specific rules and RF algorithms.
Step 5: describe different knowledge entities $E_1$ and $E_2$ and their relationship $R$ in the form of triples $(E_1, R, E_2)$ and the complete network topology model of $KG = (E, R)$.

Specifically, since we select the official railway accident reports as the data source, the following issues should be noted in the process of text processing and information mining. On the one hand, the decision-making analysts for accident management should be experienced in the railway industry, and the descriptions that different experts give of the complicated and variable types of accidents, failures, and hazards on the railway scene are fairly diversified. On the other hand, the railway is different from other engineering systems. The significance and characteristics of railway transportation determine its strict requirements for safety. It is therefore necessary to maximize the model’s learning accuracy when designing the algorithms.

In summary, we need to perform a series of processing steps on the original data set to generate a corpus suitable for learning (see Section 4.1). When designing the NER algorithm, we will build different models and their integration to maximize the fitting performance (see Section 4.2). In addition, since the writing format and style of the reports are not uniform, we should standardize the entities obtained from NER by classification (see Section 5.1).

4. Named entity recognition (NER)

4.1. Corpus generation

The GOV.UK official website has continuously released a series of Rail Accident Investigation Branch reports, including 427 useful accident/incident records from 2005 to 2020, covering the review and description of accidents in railway operations, rolling stock, signalling system, traction power supply, etc. Specifically, the accident types involve derailment, collision, runaway, electric shock, near miss, etc. This open-source text library can provide data support for our research. Accordingly, we choose the aforementioned historical railway accident reports as the original data source for text mining and knowledge extraction, completing the generation of the corpus through the following steps.

4.1.1. Focused web crawler

The web crawler is currently the mainstream text acquisition technology; here it is programmed in Python to crawl web page information automatically. Zhang et al. (2017) designed the focused web crawler algorithm to improve the accuracy and speciality of text collection in data preprocessing. The report webpages of the GOV.UK official website not only contain the accident/incident information we are concerned about, but also cover many irrelevant items (e.g., Safety Digest, Recommendations, and Explore the Topic, etc.). Thus, during the crawler task, to achieve accurate mining of useful text information, we need to download specific content from the relevant topics of the seed

3
C. Liu and S. Yang Expert Systems With Applications 207 (2022) 117991

URL in a targeted manner. Further, using the main approaches of regular grammar and keyword recognition, we formulate a collection strategy focusing on the predetermined topic, design programs to filter text, pictures and links that are irrelevant to the target subject in the webpage, extract only the main text related to the accident information, and finally save it in a standardized format.

4.1.2. Data annotation

Semantic annotation provides an in-depth understanding of the hidden values of the semantic web data and receives significant attention in big data knowledge management (Rani, Suresh, & Sethukarasi, 2019). Supervised deep learning algorithms require a large amount of labelled data to identify and locate valuable knowledge effectively. Taking word-tag as the combined unit, we employ the BMES labelling method to tag the normalized accident text data manually. Specifically, this method applies Begin, Middle, and End to respectively denote the beginning, middle, and end words of a complete named entity in the form of a phrase, and Single to represent an independent entity, while other unrelated entities are all tagged as O. According to the accident-related hazards and risk analysis requirements, we define the entity categories in the reports as shown in Table 1. Taking the DES entity as an example, its tags can be B-DES, M-DES, E-DES, and S-DES, e.g., before the annotation: struck a railway worker on the track, after the annotation: struck (B-DES) a (M-DES) railway (M-DES) worker (M-DES) on (M-DES) the (M-DES) track (E-DES).

Table 1
Entity categories.
Entity Name | Annotation Tag
Time | TIM
Location | LOC
Incident cause | CAU
Incident description | DES
Incident consequence | CON
Running speed | SPE
Unconcerned entity | O

4.1.3. Text augmentation (TA) based data enhancement

Being driven by big data is the fundamental guarantee of a neural network’s fitting performance and plays a highly significant part in the application of NLP technology. Web crawlers only acquire a small data set composed of online accident reports, which can hardly provide sufficient data support for machine/deep learning. Text augmentation can enrich text datasets to gain higher performance than that obtained with the original text dataset (Body et al., 2021). Thanks to the designer’s model preference, TA can efficiently regularize the original text to enhance the generalization and robustness of the training model. Typical ideas for implementing TA include back translation, easy data augmentation (EDA), improved EDA, context information based TA, and language modelling broadened to account for discourse aspects (LAMBADA), etc. To explore the optimum TA algorithm for text mining of the accident reports, we use 17 different approaches to process the original text data, as shown in Table 2, where the first 6 approaches are at the character level, and the rest are at the word level. Roughly speaking, the augmentation mechanism uses different core bases to add, delete, or replace random characters/words (or synonyms) at random positions in the sentence. For details of the contribution of the text data processed by each approach to the fitting accuracy of the NER model in this paper, refer to Section 4.3. (Besides, we adopt the geographic name text library for additional expansion of the entity data of the LOC type.) The total data size after augmentation exceeds 200,000 rows. For instance, Fig. 2 shows the distribution of different entity types after text augmentation by TA approach 1 in Table 2.

Table 2
Concise introduction of 17 TA approaches.
No. | Algorithm | Core Basis | Description | Implementation Principle
1 | OCR_Aug | OCR error | Augmenter that simulates OCR error by random values | Substitute character by pre-defined OCR error
2 | Keyboard_Aug | Typo error | Augmenter that applies typo error simulation to textual input | Substitute character by keyboard distance
3 | RandomChar_Aug | Random character error | Augmenter that simulates typo error by random values | Insert character randomly
4 | | | | Substitute character randomly
5 | | | | Swap character randomly
6 | | | | Delete character randomly
7 | Spelling_Aug | Spelling error | Augmenter that leverages pre-defined spelling mistake dictionary to simulate spelling mistake | Substitute word by spelling mistake words dictionary
8 | WordEmbs_Aug | Word embeddings | Augmenter that leverages word embeddings to find top n similar word for augmentation | Insert word randomly by word embeddings similarity
9 | | | | Substitute word by word embeddings similarity
10 | TFIDF_Aug | TF-IDF statistics | Augmenter that leverages TF-IDF statistics to insert or substitute word | Insert word by TF-IDF similarity
11 | | | | Substitute word by TF-IDF similarity
12 | ContextualWordEmbs_Aug | Contextual word embeddings | Augmenter that leverages contextual word embeddings to find top n similar word for augmentation | Insert word by contextual word embeddings (BERT)
13 | | | | Substitute word by contextual word embeddings (BERT)
14 | | | | Substitute word by contextual word embeddings (DistilBERT)
15 | | | | Substitute word by contextual word embeddings (RoBERTa)
16 | Synonym_Aug | Semantic meaning | Augmenter that leverages semantic meaning to substitute word | Substitute word by WordNet’s synonym
17 | Split_Aug | Word splitting | Augmenter that applies word splitting for augmentation | Split word to two tokens randomly
Abbreviations: 1. OCR: optical character recognition; 2. TF-IDF: term frequency-inverse document frequency; 3. BERT: bidirectional encoder representations from transformers; 4. DistilBERT: distilled BERT; 5. RoBERTa: robustly optimized BERT pre-training approach.

4.1.4. Text vectorization

Text data can be regarded as a collection of characters or word sequences. The neural network model maps out the statistical structure of human language, applying pattern recognition to words, sentences, and paragraphs to understand and solve text problems. Thus, we need to convert the original text into a numerical tensor for neural network learning. Common methods are one-hot encoding and word embedding (Krishnan & Jawahar, 2020).
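As an illustration of Sections 4.1.2 and 4.1.4, the following minimal Python sketch tags the example phrase from the annotation step with the BMES scheme and then one-hot encodes its words. The helper names (`bmes_tags`, `build_vocab`, `one_hot`) are hypothetical and are not taken from the paper; the paper annotates manually, so the tagging function only reproduces the resulting labels, and the vocabulary here is built from a single phrase purely for demonstration.

```python
def bmes_tags(tokens, label):
    """Tag one entity phrase with BMES labels (hypothetical helper)."""
    if len(tokens) == 1:
        return [f"S-{label}"]                      # independent single-word entity
    middle = [f"M-{label}"] * (len(tokens) - 2)    # every word between first and last
    return [f"B-{label}"] + middle + [f"E-{label}"]


def build_vocab(tokens):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def one_hot(index, size):
    """Return a vector of length `size` with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


# Example phrase taken from Section 4.1.2.
tokens = "struck a railway worker on the track".split()
tags = bmes_tags(tokens, "DES")
vocab = build_vocab(tokens)
encoded = [one_hot(vocab[tok], len(vocab)) for tok in tokens]

for tok, tag in zip(tokens, tags):
    print(f"{tok} ({tag})")
```

One-hot vectors grow with the vocabulary size, which is one reason word embeddings are usually preferred for large corpora.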


Fig. 2. Distribution of different entity types after text augmentation.

4.1.5. Data set splitting

After the above text pre-processing, we divide the resulting large data set into a training set (used to train algorithms), a development set (used to adjust parameters, select features, and make other decisions on learning algorithms), and a test set (used to evaluate the best model). Dividing the available data into these three subsets is an important part of artificial neural network model development and can affect the final model performance even more than the variability caused by other development factors (e.g., model structure selection, random weight initialization, and training) (Anctil & Lauzon, 2004; LeBaron & Weigend, 1998). Notably, data splitting does not necessarily follow a specified percentage exactly (e.g., the data size distribution of the three subsets can be 60% : 20% : 20% (Wu et al., 2013), 65% : 25% : 10% (Tibrewala et al., 2020), 66% : 17% : 17% (Ruan et al., 2022), etc.). Therefore, we can initially optimize the model performance by trying different splitting ratios. After debugging and performance comparison, we set the ratio of the three subsets in this paper to 12 : 3 : 1.

In summary, we have obtained a vectorised accident-report corpus of adequate size. Taking part of the accident report “Accident at Leatherhead” as an example, Fig. 3 depicts the above process.

4.2. NER algorithm modelling

To mine the knowledge entities from the accident reports as precisely as possible, we consider 4 algorithms, including machine learning (HMM and CRF) and deep learning (Bi-LSTM and Bi-LSTM-CRF), as well as their ensemble model, and finally

Fig. 3. Corpus generation process and an example.
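The 12 : 3 : 1 split of Section 4.1.5, the last step of corpus generation, could be sketched as follows. This is only an illustrative sketch: the function name `split_corpus`, the fixed seed, and the choice to shuffle before splitting are assumptions, since the paper does not state how samples are assigned to the subsets.

```python
import random


def split_corpus(samples, ratio=(12, 3, 1), seed=0):
    """Shuffle samples and split them into train/dev/test parts by `ratio`."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)          # assumption: random assignment
    total = sum(ratio)
    n_train = len(shuffled) * ratio[0] // total
    n_dev = len(shuffled) * ratio[1] // total
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]              # remainder goes to the test set
    return train, dev, test


# 1600 dummy sample ids stand in for the annotated, vectorised rows.
train, dev, test = split_corpus(range(1600))
print(len(train), len(dev), len(test))             # 1200 300 100
```

Trying other ratios, as the text suggests, only requires changing the `ratio` argument.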


Fig. 4. Structure diagram of Bi-LSTM network.

Fig. 5. Fitting performance under different epoch of Bi-LSTM network (panel (b): accuracy).

Fig. 6. Bi-LSTM-CRF process example for NER.


Fig. 7. Structure diagram of ensemble model.

compare the fitting performance of the above algorithm models through training on a given data set. In addition, the fitting evaluation indicators we adopt are precision, recall, and f1 score (Romijnders et al., 2021).

4.2.1. Hidden Markov model (HMM)

HMM is a statistical model that uses a Markov process with hidden states to study the observed items of a discrete-time series in real-world applications and communities (Mor, Garhwal, & Kumar, 2020). Any HMM can be defined as a five-tuple $(O, S, P_0, P_t, P_e)$, where $O$ represents the observation sequence, $S$ represents the hidden states, and $P_0$, $P_t$, and $P_e$ represent the initial probability, transition probability, and emission probability (the probability that the hidden state manifests as an observable state) of the hidden state, respectively.

We define the text tags as the state sequence and the pre-processed text as the observation sequence, then introduce the maximum likelihood estimation (MLE) (Schuler & Rose, 2017) method to solve the HMM parameters. The likelihood function of the transition probability of any row in the hidden state transition probability matrix is given in Eq. (1), and the MLE of any element $P_{t_{ij}}$ ($i, j = 1, \cdots, K$) is shown in Eq. (2), where $N_{ij}$ is the number of transitions from the hidden state $i$ to $j$, $K$ is the number of states (state $k \in \{1, \cdots, K\}$), and $\sum_{j=1}^{K} P_{t_j} = 1$.

$$L(P_{t_j}) = P_{t_1}^{N_1} P_{t_2}^{N_2} \cdots P_{t_{K-1}}^{N_{K-1}} \left(1 - P_{t_1} - \cdots - P_{t_{K-1}}\right)^{N_K}, \tag{1}$$

$$\hat{P}_{t_{ij}} = \frac{N_{ij}}{N_{i1} + N_{i2} + \cdots + N_{iK}}. \tag{2}$$

Additionally, we adopt the Viterbi algorithm (VA) to solve the state sequence for a given observation sequence (Hanif & Zimmermann, 2017), figuring out the path of maximum probability (Viterbi path) through dynamic programming. Each path corresponds to a state sequence, denoted as $x_1, x_2, \cdots, x_T$. Then, according to the recurrence relationship, we can derive

$$\begin{cases} V_{1,k} = P(O_1 \mid k) \cdot P_{0_k} \\ V_{t,k} = P(O_t \mid k) \cdot \max_{x \in S} \left( P_{t_{x,k}} \cdot V_{t-1,x} \right) \end{cases} \tag{3}$$

where $V_{t,k}$ is the probability of the most likely state sequence that accounts for the first $t$ ($t = 1, 2, \cdots, T$) observations and ends in state $k$.

We define a function $\mathrm{Ptr}(k, t)$: when $t > 1$, it returns the value of $x$ used to calculate $V_{t,k}$; when $t = 1$, it returns $k$. Finally, we obtain

$$\begin{cases} x_T = \arg\max_{x \in S} V_{T,x} \\ x_{t-1} = \mathrm{Ptr}(x_t, t) \end{cases} \tag{4}$$

4.2.2. Conditional random field (CRF)

CRF is a discriminative probability model commonly used in NLP

Fig. 8. NER accuracies under different algorithmic model.


Fig. 9. Confusion matrix of the optimal ensemble model.

tasks. It can not only overcome the independence hypothesis of the HMM, but also solve the label bias problem of the maximum entropy Markov model, using a global optimization method (Song, Zhang, & Huang, 2017).

CRF is an undirected graph model, using vertices to indicate random variables, and edges to represent the relationship among random variables. For a given observation sequence (text sequence) $O = \{O_1, O_2, \cdots, O_T\}$, the corresponding state sequence (tag sequence) $S = \{x_1, x_2, \cdots, x_T\}$ is determined by Eq. (5).

$$P(S \mid O) = \frac{1}{Z(O)} \exp\left( \sum_{i=1}^{T} \sum_{k} \lambda_k f_k(x_i, O_i) \right), \tag{5}$$

where $Z(O)$ is the normalization factor, which makes the probabilities of all state sequences sum to 1; $f_k(x_i, O_i)$ is the state characteristic function, and $\lambda_k$ is the corresponding weight. For any given text word and its tag, $f_k(x_i, O_i)$ is shown in Eq. (6).

$$f_k(x_i, O_i) = \begin{cases} 1, & O_i = \text{word and } x_i = \text{tag} \\ 0, & \text{else} \end{cases}. \tag{6}$$

Then the tag sequence with the highest probability can be obtained. The predicted result of entity recognition is

$$S^{*} = \arg\max_{S} \{ P(S \mid O) \}. \tag{7}$$

4.2.3. Bidirectional long short-term memory (Bi-LSTM)

Applying deep neural networks to the NER task in text processing has become a hot spot in NLP research in recent years. LSTM is an improved deep learning network structure based on the RNN design, which effectively mitigates the loss of historical information and the vanishing of gradients in NLP. The LSTM network consists of “forget gates,” “input gates,” and “output gates.” Eqs. (8)-(10) illustrate the forward propagation process of these 3 kinds of logic gates, respectively.

$$f_t = \sigma\left(W_f [h_{t-1}, in_t] + b_f\right) \tag{8}$$

$$\begin{cases} i_t = \sigma\left(W_i [h_{t-1}, in_t] + b_i\right) \\ s_t = \tanh\left(W_s [h_{t-1}, in_t] + b_s\right) \\ c_t = f_t \odot c_{t-1} + i_t \odot s_t \end{cases} \tag{9}$$

$$\begin{cases} o_t = \sigma\left(W_o [h_{t-1}, in_t] + b_o\right) \\ h_t = o_t \odot \tanh c_t \\ out = \mathrm{softmax}\, h_t \end{cases} \tag{10}$$

where $in_t$ and $out$ represent input and output, respectively; $i_t$, $f_t$, and $o_t$ are the unit statuses of the input, forget, and output gates at time $t$, respectively; $s_t$ is the temporary state quantity; $c_t$ is the cell state at time $t$; $h_t$ represents the vector of the hidden layer at time $t$; $W_f$, $W_i$, $W_s$, and $W_o$ are the corresponding weight matrices; $b_f$, $b_i$, $b_s$, and $b_o$ are the corresponding bias terms; $\odot$ indicates element-wise multiplication; $\sigma$ is the sigmoid function.
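To make Eqs. (8)-(10) concrete, the following pure-Python sketch performs a single LSTM forward step. It is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `params` dictionary layout (gate name mapped to a weight matrix and bias), and the all-zero toy weights are hypothetical.

```python
import math


def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh(v):
    """Element-wise hyperbolic tangent."""
    return [math.tanh(x) for x in v]


def affine(W, z, b):
    """Compute W z + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, z)) + bias for row, bias in zip(W, b)]


def softmax(v):
    """Numerically stable softmax."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    total = sum(e)
    return [x / total for x in e]


def lstm_step(in_t, h_prev, c_prev, params):
    """One forward step; params maps 'f', 'i', 's', 'o' to (W, b) pairs."""
    z = h_prev + in_t                               # concatenation [h_{t-1}, in_t]

    def gate(name, activation):
        W, b = params[name]
        return activation(affine(W, z, b))

    f_t = gate("f", sigmoid)                        # Eq. (8): forget gate
    i_t = gate("i", sigmoid)                        # Eq. (9): input gate
    s_t = gate("s", tanh)                           # Eq. (9): temporary state
    c_t = [f * c + i * s for f, c, i, s in zip(f_t, c_prev, i_t, s_t)]
    o_t = gate("o", sigmoid)                        # Eq. (10): output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t, softmax(h_t)                   # out = softmax(h_t)


# Toy setting: hidden size 2, input size 3, all weights and biases zero.
hidden, n_in = 2, 3
zero_gate = ([[0.0] * (hidden + n_in) for _ in range(hidden)], [0.0] * hidden)
params = {name: zero_gate for name in "fiso"}
h_t, c_t, out = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
print(c_t)                                          # [0.5, -0.5]
```

With zero weights every sigmoid gate equals 0.5 and the temporary state is 0, so the new cell state is simply half of the previous one, which makes the step easy to check by hand.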


(a) Standardized categories of CAU entity.


Fig. 10. Standardized categories of hazard related entities.

As an extension of conventional LSTM (Guo et al., 2016), Bi-LSTM can learn bidirectional long-term dependencies among time steps of time series or sequence data and retain the past and future information at any moment. The prediction result of Bi-LSTM contains the information of the input sequence in both the forward and backward order, and the weight is reused, reducing the risk of under-fitting (i.e., insufficient model fitting, resulting in poor training and testing accuracy). Fig. 4 reveals the prediction and output process of the Bi-LSTM network.

In addition, the fitting performance of the LSTM network is affected by the parameter epoch to a certain extent, so we optimize the epoch on the corpus obtained by TA approach 1 in Section 4.1. The result is shown in Fig. 5, where val_loss is the validation loss, and the accuracy indicators include precision, recall, and f1 score. When epoch = 15, the val_loss of the model reaches its minimum and no over-fitting occurs (i.e., the model's learning ability is too strong, resulting in a decline in generalization ability: high training accuracy but poor testing accuracy). The model also achieves its optimal fitting accuracy, i.e., precision = 95.70%, recall = 95.19%, and f1 score = 95.11%.

4.2.4. Bi-LSTM-CRF

To explore better text embedding and model training effects, we add a CRF layer on top of Bi-LSTM, as shown in Fig. 6. The model constructs the feature representation of the text sequence through the word embedding layer. After the feature vector is passed to the Bi-LSTM layer, the hidden state of the character feature vector is generated. Finally, after the hidden layer completes the feature marking, the decoding of the text sequence's prediction tag is accomplished in the CRF layer. Similar to Section 4.2.1, we still take VA for decoding.
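As an illustration of the decoding step, the sketch below implements a generic Viterbi search over per-position tag scores plus tag-to-tag transition scores. It assumes that VA denotes the Viterbi algorithm and that the CRF layer exposes additive scores; the function name, the score dictionaries, and the three tags in the usage note are hypothetical and are not taken from the paper's model.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag path.

    emissions: list (one entry per word) of dicts mapping tag -> score.
    transitions: dict mapping (previous_tag, current_tag) -> score; missing pairs score 0.
    """
    best = [{t: emissions[0][t] for t in tags}]
    back = []
    for em in emissions[1:]:
        scores, pointers = {}, {}
        for cur in tags:
            # best previous tag for ending at `cur` on this word
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + em[cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # follow the back-pointers from the best final tag
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

For example, with hypothetical tags `B-CAU`, `I-CAU`, and `O`, a positive score for the transition `B-CAU` → `I-CAU` can make the path `B-CAU, I-CAU, O` win even when `O` has the higher per-word score at the second position, which is the kind of sequence-level consistency the CRF layer contributes.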


(b) Standardized categories of DES entity.


Fig. 10. (continued).

4.2.5. Ensemble model

Considering that ensemble learning is a valid approach to enhance prediction accuracy and decompose complex and challenging learning problems into simpler sub-problems (Krawczyk et al., 2017), we construct an ensemble model of the above 4 algorithms, as shown in Fig. 7. Specifically, the ensemble model takes each word in the text sequence as input, predicts the corresponding category with each of the four single models (sub-algorithms), completes the voting based


(c) Standardized categories of CON entity.


Fig. 10. (continued).

(a) ROC curve of CAU entity classification. (b) ROC curve of DES entity classification.

(c) ROC curve of CON entity classification.


Fig. 11. ROC curves of RF model classification.


Fig. 12. RKGRS visualization.

on the given voting mechanism (we choose the majority voting rule), and finally outputs the predicted tag.

In Fig. 7, the majority voting rule requires that a label receive more than half of the votes to be predicted as the class; otherwise, the prediction is rejected, as shown in Eq. (11).

$$H(x) = \begin{cases} c_j, & \text{if } \sum_{i=1}^{T} h_i^{j}(x) > 0.5 \sum_{k=1}^{N} \sum_{i=1}^{T} h_i^{k}(x) \\ \text{reject}, & \text{otherwise} \end{cases} \tag{11}$$

where $h_i(x)$ represents the sub-algorithm model (with $h_i^{j}(x)$ its vote for class j), T is the number of classifiers, N is the number of classes, and $c_j$ is the prediction result for class j.

4.3. Algorithm verification

After the above work, we have obtained an accident report corpus enhanced with 17 TA approaches and 5 NER algorithm models. We perform actual training and prediction after traversing these data sets and models in different combinations, and Fig. 8 shows the results. (We also tested the NER models mentioned above on the unaugmented corpus; none of the accuracies was satisfactory, and the highest was only about 70%.) The meaning of the coordinate label above each curve in Fig. 8 is

(x, y, z) = (number of TA approach, number of NER approach, highest point of the curve)

By comparing the 3 accuracy evaluation indicators, it can be derived that NER algorithm 5 (i.e., the green curve: ensemble model) achieves the optimal fitting capability under the text preprocessing by TA approach 8, i.e., precision = 98.12%, recall = 98.30%, and f1 score = 98.16%.

For the obtained optimal corpus and NER model, we employ the confusion matrix to visualize the algorithm performance (Ruuska et al., 2018), as shown in Fig. 9. Most of the prediction results are consistent with the actual tags, verifying the effectiveness and accuracy of the NER algorithm. However, the predictions of entities with the labels “S-SPE,” “S-CON,” and “S-TIM” present relatively higher error rates, which are 39.57%, 33.33%, and 26.67%, respectively. According to the annotation rule in Section 4.1.2, an entity whose tag starts with “S” uses one single word to express its complete meaning. The entities of “running speed,” “incident consequence,” and “time” rarely appear in the form of a single word, resulting in an imbalance in the distribution of corpus data and reducing the prediction accuracy of the corresponding entity categories.
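The majority voting rule of Eq. (11) can be sketched in a few lines of Python. This is a minimal illustration under the assumption that each sub-model casts exactly one vote per word, so the right-hand side of the inequality reduces to half the number of sub-models; the function name and the `REJECT` marker are hypothetical.

```python
from collections import Counter

REJECT = "reject"  # hypothetical marker for a rejected prediction

def majority_vote(predictions):
    """Return the tag chosen by more than half of the sub-models, else REJECT."""
    tag, votes = Counter(predictions).most_common(1)[0]
    return tag if votes > 0.5 * len(predictions) else REJECT
```

With four sub-models, three agreeing votes yield that tag, whereas a 2:2 split is rejected because neither tag exceeds half of the votes.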


(a) Degree distribution.

(b) Weighted degree distribution.
Fig. 13. Degree and weighted degree distributions of hazard-related entities.

5. RKGRS visualized completion

5.1. Random forest (RF) based entity classification

The above research results show that the unstructured text sequences representing various entities output by the NER model present obviously inconsistent styles and rhetoric. Some entities recorded as different text sequences may correspond to the same meaning, e.g., “vehicle intrusion,” “pedestrian intrusion,” and “debris on the track” can all be collectively expressed as “foreign object intrusion.” Displaying all redundant texts with the same meaning will greatly increase the complexity of the knowledge graph and reduce the readability of knowledge links. Hence, consistent with Section 3, it is necessary to standardize the knowledge entities extracted from the original accident reports to finish knowledge fusion with a more precise integration of the target nodes.

To explore and finally quantitatively depict the causal linking among hazards, risks, and accidents around railway safety, we focus on the hazard-related entities derived by NER featuring CAU, DES, and CON. Based on the accumulation of professional knowledge and specialist research in the railway field, we define the standardized categories of these 3 entities and the corresponding weights of CON entities (corresponding to the $w_k$ in Section 6.3, Eq. (18)), as shown in Fig. 10.

The RF algorithm is emerging in machine learning with high flexibility and excellent fitting ability for classification. It forms an integrated model through multiple random decision trees, and is virtually a special ensemble learning algorithm with a bagging framework (Zhou & Qiu, 2018). Thus, we manually annotate the recognized entities, split the data into a training set and a test set at a ratio of 4:1, and adopt RF to fulfil entity classification. Fig. 10 also indicates the correspondence between the entity texts before and after the annotation. For example, in Fig. 10 (a), “Response delay of the driver,” “Incorrect response to previous caution signal,” and “Insufficient response to earlier report” are all labelled “C03: missing, belated, or incorrect response to commands or hazards.”

We use GridSearchCV, an automatic parameter adjustment tool available in Python, to optimize the RF model. The optimum parameter combination is: max_features = 7 (maximum number of features), n_estimators = 5 (number of decision trees), and random_state = 17 (random seed).

The receiver operating characteristic (ROC) curve takes false positive rate (FPR) and true positive rate (TPR) as coordinate axes. The larger the area enclosed by the curve and the coordinate axes, the better the model performance (Perez et al., 2019). We utilize ROC to evaluate the classification effect of the optimized RF model visually, and Fig. 11 shows the result. Evidently, the model exhibits high-precision prediction performance in all 3 entity classification tasks.
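To illustrate how an ROC curve and the area under it are obtained, the following sketch computes the (FPR, TPR) points and the trapezoidal area for a single binary task in plain Python. It assumes a one-vs-rest view of one entity category with both classes present; the function names and the toy scores in the usage note are invented for this example and are not results from the paper.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    labels: 1 for the positive class, 0 otherwise (both classes must be present).
    scores: predicted probability or score of the positive class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # emit a point only after the last sample sharing this score
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

For labels `[1, 1, 0, 1, 0, 0]` ranked by scores `[0.9, 0.8, 0.7, 0.6, 0.3, 0.2]`, eight of the nine positive-negative pairs are ordered correctly, so the area is 8/9; a perfect ranking gives an area of 1.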


(a) Active/passive causal closeness of CAU entity. (b) Active/passive causal closeness of DES entity. (c) Active/passive causal closeness of CON entity.

Fig. 14. Active/passive causal closeness of hazard related entities.

5.2. RKGRS visualized completion using Gephi

Gephi is a tool mainly for data analysts to understand complex graphs more readily (Thirumalai, Sree, & Gannu, 2017). In the construction of RKGRS, we take nodes to indicate hazard-related entities and edges to denote the links among entities; the degree of a node represents the number of links the node has, and the weighted degree of the node additionally considers the weight of each edge:

degree = indegree + outdegree (12)

weighted degree = weighted indegree + weighted outdegree (13)

Since Gephi has powerful data manipulation and graph construction capabilities, we only need to form an Excel file in the prescribed format with the standardized entity nodes and relationship links and import it into Gephi; the graph is then generated automatically. Proper parameter adjustment can further improve the visualization performance of the graph.

Fig. 12 reveals the visualization results of the causal-oriented relationship among hazard-related entities, where the node's size represents the parameter degree, and the width of the edge represents the weight of the path. The codes in the nodes correspond to the standardized categories in Fig. 10. The average values of degree and weighted degree of this graph are 4.831 and 12.028, respectively, and the average path length is 2.519. An example of a complete knowledge link is

C06 (imperfect management of maintenance practices)
→ D01 (struck-by-object or collision)
→ D07 (derailment)
→ K03 (damage to structure, component, or device)
→ K01 (injuries)

The concrete degree and weighted degree distribution of each entity node is shown in Fig. 13. Taking the above results into consideration comprehensively, we can initially conclude that C01, C02, C12, and C20 in CAU entity, D01, D07, D13, and D22 in DES entity, and K01, K02, and K03 in CON entity have the most relationship links in RKGRS, or the relatively highest contribution level to the occurrence of railway safety accidents.

6. Application: risk assessment

Specific parameter calculation and network topology analysis are carried out for the hazard-related entities and their relationship chains in RKGRS, which can realize the quantitative assessment of risk levels. Moreover, to probe into the potential characteristics of historical accident data by analysing the multi-dimensional topological network model of the RKGRS and the quantitative description of the relationship among multiple types of knowledge entities, we define a series of safety-related feature parameters. To facilitate calculation and reference, we denote the network node-set as Nodes = {C, D, K}.

6.1. Active/Passive causal closeness

Given an entity h ∈ Nodes, due to the function transfer within each branch of the railway and the connectivity among branches, it may form a chain of causality with entities of the same or distinct type. Therefore, we use active causal closeness and passive causal closeness to indicate the degree of difficulty with which entity h directly or indirectly causes other entities to occur, and the degree of difficulty with which it is caused by the occurrence of other entities, respectively, denoted as $CA_h$ and $CP_h$, shown in Eqs. (14) and (15). The greater the calculated value, the more probable the linking of the path and direction will occur.

$$CA_h = 1 \Bigg/ \left( \sum_{i \in Nodes} \mathrm{dist\_min}_{hi} \Bigg/ \sum_{i \in Nodes} \mathrm{dist\_true}_{hi} \right) \tag{14}$$

$$CP_h = 1 \Bigg/ \left( \sum_{i \in Nodes} \mathrm{dist\_min}_{ih} \Bigg/ \sum_{i \in Nodes} \mathrm{dist\_true}_{ih} \right) \tag{15}$$

where i is any entity node in the RKGRS network except h; dist_min represents the shortest path from its subscript node 1 to node 2; dist_true represents whether there is a causal path between its subscript node 1 and node 2: only if the path exists, dist_true = 1; otherwise, dist_true = 0.

According to Eqs. (14) and (15), the calculated active/passive causal closeness of nodes with entity types CAU, DES, and CON are shown in Fig. 14. Some nodes present high passive causal closeness values, such as C03, C04, C08, C09, and C10 in CAU entity, and D09, D23, D25, D26, and D31 in DES entity; being mostly initiated by other hazards, they can be


(a) Immediate successor/predecessor proportion of CAU entity.


Fig. 15. Immediate successor/predecessor proportion of hazard related entities.

regarded as accumulation hazards. Conversely, some nodes exhibit high active causal closeness values, such as D04, D06, D16, D17, and D20 in DES entity, and K04, K05, and K07 in CON entity; these are considered source hazards and may bring substantial risk consequences for railway safety. Also, entities C05, D24, D26, and D33 have almost equal active and passive causal closeness and lie mainly in the middle position of the causal link.

6.2. Immediate Successor/Predecessor proportion

Considering that a given entity h may be any link in the RKGRS network, we define immediate successor proportion to represent the proportion of type T among all entity types that entity h can directly cause, and immediate predecessor proportion to indicate the proportion of type T in all entity types that may directly cause entity h, recorded as $PS_{hT}$ and $PP_{hT}$ respectively, as shown in Eqs. (16) and (17). These two parameters can reflect the connectivity among entity types.

$$PS_{hT} = \frac{\sum_{j \in Nodes} \left( \mathrm{dist\_true}_{hj} \cdot \mathrm{type}_{jT} \right)}{\sum_{j \in Nodes} \mathrm{dist\_true}_{hj}} \tag{16}$$

$$PP_{hT} = \frac{\sum_{j \in Nodes} \left( \mathrm{dist\_true}_{jh} \cdot \mathrm{type}_{jT} \right)}{\sum_{j \in Nodes} \mathrm{dist\_true}_{jh}} \tag{17}$$

where $\mathrm{type}_{jT} = 1$ only if the entity type of node j is T; otherwise, $\mathrm{type}_{jT} = 0$.

According to Eqs. (16) and (17), we figure out the hazard entity's immediate successor/predecessor proportion as shown in Fig. 15. Clearly, the direct successors of CAU entities are mostly DES types, and the immediate predecessors of CAU entities are all hazard entities of the same type. The probability of DES entities directly causing hazards of the same type and CON type is almost equal, and the proportion of its predecessor nodes CAU or DES entities is close to half. As for the CON entity, its successors are of the same type, and there are a few cases caused by DES entities. Fig. 15 illustrates the specific calculation results of the distribution of each hazard node. Exploring the ratio of immediate successor/predecessor is helpful to analyse the linking among the hazards. Besides, combined with the closeness-based hazard entities' scatter diagram in Fig. 14, this analysis can provide theoretical support for formulating specific safety strategies for each hazard.

6.3. Risk indicators

The risk indicator is a comprehensive estimation of the difficulty of a certain hazard entity h directly leading to the occurrence of the incident consequence entity K in the RKGRS network and the severity of the consequences. We denote this parameter as $R_h$, as shown in Eq. (18).

$$R_h = \mathrm{normalization}\left( fre_h \cdot \sum_{k \in K} \left( \mathrm{dist\_true}_{hk} \cdot P(k \mid h) \cdot w_k \right) \right), \tag{18}$$

where $fre_h$ represents the frequency of occurrence of hazard h; k belongs to the K-type entity node of the RKGRS network; $P(k \mid h)$ is the probability of consequence k caused by hazard h, and $P(k \mid h) = P(kh)/P(h)$; $w_k$ means the weight of the consequence k, and we assign the weight according to the severity of the corresponding consequence; the normalization function is introduced for interval division.
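A minimal sketch of how Eq. (18) might be evaluated is given below, together with the interval-to-level conversion that Section 6.3 states for $R_h$. Several points are assumptions for illustration: the paper does not specify the normalization function, so min-max scaling is used here as a stand-in; the function names, the dictionary layout, and the hazard and consequence identifiers in the usage note are hypothetical.

```python
def raw_risk(freq, reach, cond_prob, weight):
    """Unnormalized Eq. (18) for one hazard h.

    reach[k] is dist_true_hk (0 or 1), cond_prob[k] is P(k|h), weight[k] is w_k.
    """
    return freq * sum(reach[k] * cond_prob[k] * weight[k] for k in weight)

def min_max(values):
    """Scale values to [0, 1]; an assumed form of the normalization function."""
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return {h: (v - lo) / span for h, v in values.items()}

def risk_level(r):
    """Map a normalized indicator to the qualitative levels of Section 6.3."""
    if r < 0.25:
        return "negligible"
    if r < 0.5:
        return "tolerable"
    if r < 0.75:
        return "undesirable"
    return "intolerable"
```

Under these assumptions, three hypothetical hazards with raw values 4.5, 1.0, and 2.4 are scaled to 1.0, 0.0, and 0.4 and therefore fall into the intolerable, negligible, and tolerable levels, respectively.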


(b) Immediate successor/predecessor proportion of DES entity.

(c) Immediate successor/predecessor proportion of CON entity.


Fig. 15. (continued).
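The topology parameters behind Figs. 14 and 15 (Eqs. (14)–(17)) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the graph is a plain adjacency dictionary, dist_min is taken as an unweighted hop count found by breadth-first search, "immediate" successors are read as direct edges, and the entity type is inferred from the first letter of the node code (C, D, or K). All function names are hypothetical.

```python
from collections import deque

def shortest_hops(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[source]
    return dist

def reverse(graph):
    """Flip every edge, so successors become predecessors."""
    rev = {}
    for src, targets in graph.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def closeness(graph, h):
    """Eq. (14) on `graph`; call with reverse(graph) to obtain Eq. (15)."""
    dist = shortest_hops(graph, h)
    # number of reachable nodes divided by the sum of shortest distances
    return len(dist) / sum(dist.values()) if dist else 0.0

def successor_proportion(graph, h, entity_type):
    """Eq. (16); call with reverse(graph) to obtain Eq. (17)."""
    succ = graph.get(h, ())
    return sum(s.startswith(entity_type) for s in succ) / len(succ) if succ else 0.0
```

For instance, on the example chain C06 → D01 → D07 → K03 → K01 extended with one hypothetical extra edge D01 → K03, node C06 reaches four nodes at a total distance of 8 hops, giving an active causal closeness of 0.5, and half of the direct successors of D01 are of type D.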


(a) (Intermediary) risk indicator values of CAU entity.

(b) (Intermediary) risk indicator values of DES entity.

(c) (Intermediary) risk indicator values of CON entity.


Fig. 16. Risk indicator values of hazard related entities.


Noticeably, hazard h may also act as an intermediary in the linking where the accident occurs. Therefore, we define the parameter $RB_h$ to describe the intermediate risk level of a certain hazard on railway safety, as shown in Eq. (19), where $R_i$ refers to the risk indicator in Eq. (18). In risk prevention and safety management, the risk can be reduced by cutting off the propagation link of the intermediary hazard factors.

$$RB_h = \mathrm{normalization}\left( \sum_{i \in C \cup D,\, k \in K} \left( \mathrm{dist\_true}_{ih} \cdot \mathrm{dist\_true}_{hk} \cdot R_i \right) \right). \tag{19}$$

The weight distribution of the CON entity (see Fig. 10 (c)) can reflect the severity of the accident consequences. To quantitatively analyse the risk level of railway safety accidents caused by CAU and DES hazards, we apply Eqs. (18) and (19) to calculate the corresponding risk indicators. The results of the quantitative indicators $R_h$ and $RB_h$ in the equations can directly correspond to the qualitative description of the hazard's risk level, and the conversion relationship is risk level = {negligible | $R_h$ ∈ [0, 0.25), tolerable | $R_h$ ∈ [0.25, 0.5), undesirable | $R_h$ ∈ [0.5, 0.75), intolerable | $R_h$ ∈ [0.75, 1]} ($R_h$ can be replaced with $RB_h$).

Fig. 16 reflects the risk assessment results of the hazard-related entities. C01, C12, C20, D01, and D07 have relatively high risk values and are also intermediary risks that should be paid attention to. Although C15 belongs to “negligible risk,” it also plays a significant role as an intermediary of accidents. Investigating the comprehensive assessment results of each hazard risk level in Fig. 16 (c), C12, C20, and D01 are “intolerable risk,” C01, C15, and D07 are “undesirable risk,” and the remaining hazards belong to “tolerable risk” and “negligible risk.”

Railway is a typical safety-critical system, and timely or advanced identification and elimination of hazards is the key technology to realize active safety. Consequently, the prevention and control of high-risk hazards should be fully considered when formulating modes and strategies for safety management. Similarly, the hazard entity that acts as an intermediary section in the accident chain should also be deliberately considered in the risk assessment process. Eliminating the intermediary hazard node in RKGRS can productively curb the linking of accident occurrence and risk propagation, ultimately reducing the risk in railway safety.

7. Conclusions

Focusing on the characteristics of the variable operating environment and complex risk factors under the background of the railway system, a KG oriented to railway safety accidents/incidents is established based on text mining technology. By establishing a method that uses text mining to build a KG and then applies it to railway risk assessment, the event information with text as the carrier is transformed into a risk level prediction for safety analysis. The innovative work and conclusions are as follows:

1) The methodology of RKGRS modelling is established. After data enhancement based on 17 TA algorithms, we construct a corpus (containing more than 200,000 rows of named entities and tags) from 427 railway accident reports, optimize different learning algorithms, and build an ensemble NER model. Moreover, training and verification are completed on the pre-processed corpus data set to achieve hazard entity mining. The results show that the NER effect reaches the best performance: precision = 98.12%, recall = 98.30%, and f1 score = 98.16%, when the word embedding-based TA is combined with the ensemble model.
2) Through qualitative analysis of accidents, we define 71 entity-types containing 27 incident causes, 33 incident descriptions, and 11 incident consequences. Based on the RF algorithm, standardized classification of entities is done to achieve knowledge fusion. Then we establish the visualized structure of RKGRS using Gephi.
3) The nodes and links in RKGRS can reflect the significance of the entity to the risk and the risk evolution law. The constructed connectivity relationships provide a substantial body of knowledge for clarifying risk factors and propagation mechanism. By defining and calculating safety-related feature parameters, the potential risks and consequences of any entity node to railway safety can be evaluated. Through a series of calculations, the quantitative levels of various hazards are obtained, and the impact of hazards on railway safety is quantitatively illustrated, i.e., the application of RKGRS in risk assessment is realized.
4) The methodology of this paper can be extended to apply to the corpus composed of accident/incident reports from other sources, provided that designing and corresponding preprocessing have been finished before constructing the data set.

In conclusion, the RKGRS proposed in this paper is helpful to effectively and intelligently extract a great deal of information and decision-making experience from historical incident/accident reports and can be successfully applied in risk assessment; it is expected to improve the efficiency of risk management and prevention for unknown events in the future.

Funding

This work was supported in part by the Ministry of Science and Technology of the People’s Republic of China [grant number W21B05300031] and the National Railway Administration of the People’s Republic of China [grant number SJ2021-041].

CRediT authorship contribution statement

Chang Liu: Conceptualization, Software, Methodology, Validation, Writing – original draft, Data curation, Resources. Shiwu Yang: Conceptualization, Methodology, Supervision, Writing – review & editing, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The manuscript states that the data source for this study is the Rail Accident Investigation Branch reports published on the GOV.UK official website (please see the first paragraph of Section 4.1).

References

Anctil, F., & Lauzon, N. (2004). Generalisation for neural networks through data sampling and training procedures, with applications to streamflow predictions. Hydrology and Earth System Sciences, 8, 940–958. https://doi.org/10.5194/hess-8-940-2004
Benchimol, J., Kazinnik, S., & Saadon, Y. (2022). Text mining methodologies with R: An application to central bank texts. Machine Learning with Applications, 8, Article 100286. https://doi.org/10.1016/j.mlwa.2022.100286
Bischof, S., & Schenner, G. (2021). Rail topology ontology: A rail infrastructure base ontology. Lecture Notes in Computer Science, 12922, 597–612. https://doi.org/10.1007/978-3-030-88361-4_35
Body, T., Tao, X., Li, Y., Li, L., & Zhong, N. (2021). Using back-and-forth translation to create artificial augmented textual data for sentiment analysis models. Expert Systems with Applications, 178, Article 115033. https://doi.org/10.1016/j.eswa.2021.115033
Chen, D., Xu, C., & Ni, S. (2017). Data mining on Chinese train accidents to derive associated rules. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 231(2), 239–252. https://doi.org/10.1177/0954409715624724
CNN (2021). In photos: Deadly train derailment in Taiwan. Retrieved from https://edition.cnn.com/2021/04/02/world/gallery/taiwan-train-derailment/index.html. Accessed April 5, 2021.
Duan, Y., Shao, L., Hu, G., Zhou, Z., Zou, Q., & Lin, Z. (2017). Specifying architecture of knowledge graph with data graph, information graph, knowledge graph and wisdom graph. In 2017 IEEE/ACIS 15th international conference on software engineering


research, management and applications (SERA) (pp. 327–332). https://doi.org/10.1109/SERA.2017.7965747
Nettleton, F. D., Salas, J. (2016). A data driven anonymization system for information rich online social network graphs. Expert Systems with Applications, 55, 87–105. https://doi.org/10.1016/j.eswa.2016.02.004
Gozuacik, N., Sakar, C., & Ozcan, S. (2021). Social media-based opinion retrieval for product analysis using multi-task deep neural networks. Expert Systems with Applications, 183, Article 115388. https://doi.org/10.1016/j.eswa.2021.115388
Guo, Y., et al. (2016). Deep learning for visual understanding: A review. Neurocomputing, 187, 27–48. https://doi.org/10.1016/j.neucom.2015.09.116
Habibi, M., Weber, L., Neves, M., Wiegandt, D. L., & Leser, U. (2017). Deep learning with word embeddings improves biomedical named entity recognition. Bioinformatics, 33(14), i37–i48. https://doi.org/10.1093/bioinformatics/btx228
Halilaj, L., Dindorkar, I., Lüttin, J., & Rothermel, S. (2021). A knowledge graph-based approach for situation comprehension in driving scenarios. Lecture Notes in Computer Science, 12731, 699–716. https://doi.org/10.1007/978-3-030-77385-4_42
Hanif, M. K., & Zimmermann, K. H. (2017). Accelerating Viterbi algorithm on graphics processing units. Computing, 99(11), 1105–1123. https://doi.org/10.1007/s00607-017-0557-6
He, S., Sun, D., & Wang, Z. (2021). Named entity recognition for Chinese marine text with knowledge-based self-attention. Multimedia Tools and Applications, Early Access. https://doi.org/10.1007/s11042-020-10089-z
Heidarysafa, M., Kowsari, K., Barnes, L., & Brown, D. (2018). Analysis of railway accidents’ narratives using deep learning. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 1446–1453). https://doi.org/10.1109/ICMLA.2018.00235
Hua, L., Zheng, W., & Gao, S. (2019). Extraction and analysis of risk factors from Chinese railway accident reports. In 2019 IEEE Intelligent Transportation Systems Conference (ITSC) (pp. 869–874). https://doi.org/10.1109/ITSC.2019.8917094
Hughes, P., Robinson, R., Figueres-Esteban, M., & van Gulijk, C. (2019). Extracting safety information from multi-lingual accident reports using an ontology-based approach. Safety Science, 118, 288–297. https://doi.org/10.1016/j.ssci.2019.05.029
Jiang, Y., Gao, X., Su, W., & Li, J. (2021). Systematic knowledge management of construction safety standards based on knowledge graphs: A case study in China. International Journal of Environmental Research and Public Health, 18(20), Article 10692. https://doi.org/10.3390/ijerph182010692
Kejriwal, M., Sequeda, J., & Lopez, V. (2019). Knowledge graphs: Construction, management and querying. Semantic Web, 10(6), 961–962. https://doi.org/10.3233/SW-190370
Krawczyk, B., Minku, L. L., Gama, J., Stefanowski, J., & Wozniak, M. (2017). Ensemble learning for data stream analysis: A survey. Information Fusion, 37, 132–156. https://doi.org/10.1016/j.inffus.2017.02.004
Krishnan, P., & Jawahar, C. V. (2020). Bringing semantics into word image representation. Pattern Recognition, 108, Article 107542. https://doi.org/10.1016/j.patcog.2020.107542
Kwon, S., Ko, Y., & Seo, J. (2019). Effective vector representation for the Korean named-entity recognition. Pattern Recognition Letters, 117, 52–57. https://doi.org/10.1016/j.patrec.2018.11.019
Kyriakidis, M. (2013). Developing a human performance railway operational index to enhance safety of railway operations, Ph.D. Dissertation. UK: Imperial College London.
Kyriakidis, M., Majumdar, A., & Ochieng, W. Y. (2015). Data based framework to identify the most significant performance shaping factors in railway operations. Safety Science, 78, 60–76. https://doi.org/10.1016/j.ssci.2015.04.010
Lam, C. Y., & Tai, K. (2020). Network topological approach to modeling accident causations and characteristics: Analysis of railway incidents in Japan. Reliability Engineering & System Safety, 193, Article 106626. https://doi.org/10.1016/j.ress.2019.106626
LeBaron, B., & Weigend, A. S. (1998). A bootstrap evaluation of the effect of data splitting on financial time series. IEEE Transactions on Neural Networks, 9(1), 213–220. https://doi.org/10.1109/72.655043
Li, C., Tang, T., Chatzimichailidou, M. M., Jun, G. T., & Waterson, P. (2019). A hybrid human and organisational analysis method for railway accidents based on STAMP-HFACS and human information processing. Applied Ergonomics, 79, 122–142. https://doi.org/10.1016/j.apergo.2018.12.011
Mannering, F., Bhat, C. R., Shankar, V., & Abdel-Aty, M. (2020). Big data, traditional data and the tradeoffs between prediction and causality in highway-safety analysis. Analytic Methods in Accident Research, 25, Article 100113. https://doi.org/10.1016/j.amar.2020.100113
Mao, S., Zhao, Y., Chen, J., Wang, B., & Tang, Y. (2020). Development of process safety knowledge graph: A Case study on delayed coking process. Computers & Chemical Engineering, 143, Article 107094. https://doi.org/10.1016/j.compchemeng.2020.107094
Mor, B., Garhwal, S., & Kumar, A. (2020). A systematic review of hidden markov models and their applications. Archives of Computational Methods in Engineering, 28, 1429–1448. https://doi.org/10.1007/s11831-020-09422-4
Pan, Y., Zhang, L., Li, Z., & Ding, L. (2020). Improved fuzzy Bayesian network-based risk analysis with interval-valued fuzzy sets and D-S evidence theory. IEEE Transactions on Fuzzy Systems, 28(9), 2063–2077. https://doi.org/10.1109/TFUZZ.2019.2929024
Parsa, A. B., Taghipour, H., Derrible, S., & Mohammadian, A. (2019). Real-time accident detection: Coping with imbalanced data. Accident Analysis & Prevention, 129, 202–210. https://doi.org/10.1016/j.aap.2019.05.014
Perez, I. M., Airola, A., Boström, P. J., Jambor, I., & Pahikkala, T. (2019). Tournament leave-pair-out cross-validation for receiver operating characteristic analysis. Statistical Methods in Medical Research, 28(10–11), 2975–2991. https://doi.org/10.1177/0962280218795190
Qiu, L., Cai, F., Li, J., Peng, L., & Zhang, Y. (2018). Tibetan Weibo user group division based on user behaviors for analyzing health problems. IEEE Access, 6, 19441–19450. https://doi.org/10.1109/ACCESS.2018.2822767
Rani, P. S., Suresh, R. M., & Sethukarasi, R. (2019). Multi-level semantic annotation and unified data integration using semantic web ontology in big data processing. Cluster Computing, 22, 10401–10413. https://doi.org/10.1007/s10586-017-1029-7
Rinaldi, A. M., Russo, R., & Tommasino, C. (2021). A semantic approach for document classification using deep neural networks and multimedia knowledge graph. Expert Systems with Applications, 169, Article 114320. https://doi.org/10.1016/j.eswa.2020.114320
Romijnders, R., Warmerdam, E., Hansen, C., Schmidt, G., & Maetzler, W. (2021). Validation of IMU-based gait event detection during curved walking and turning in older adults and Parkinson’s Disease patients. Journal of NeuroEngineering and Rehabilitation, 18, Article 28. https://doi.org/10.1186/s12984-021-00828-0
Ruan, J., Meng, Y., Zhao, F., Gu, H., He, L., Gong, X. (2022). Development of deep learning-based automatic scan range setting model for lung cancer screening low-dose CT imaging. Academic Radiology, Available online 5 February 2022. https://doi.org/10.1016/j.acra.2021.12.001
Ruuska, S., et al. (2018). Evaluation of the confusion matrix method in the validation of an automated system for measuring feeding behaviour of cattle. Behavioural Processes, 148, 56–62. https://doi.org/10.1016/j.beproc.2018.01.004
Sakamoto, H. (2008). Fatigue and fracture mechanics in products development for railroad vehicles. In Proceedings of the ASME international mechanical engineering congress and exposition 2007 (pp. 433–438).
Sangare, M., Gupta, S., Bouzefrane, S., Banerjee, S., & Muhlethaler, P. (2021). Exploring the forecasting approach for road accidents: Analytical measures with hybrid machine learning. Expert Systems with Applications, 167, Article 113855. https://doi.org/10.1016/j.eswa.2020.113855
Schuler, M. S., & Rose, S. (2017). Targeted maximum likelihood estimation for causal inference in observational studies. American Journal of Epidemiology, 185(1), 65–73. https://doi.org/10.1093/aje/kww165
Shultz, J. M., et al. (2016). Disaster complexity and the Santiago de Compostela train derailment. Disaster Health, 3(1), 11–31. https://doi.org/10.1080/21665044.2015.1129889
Song, S., Zhang, N., & Huang, H. (2017). Named entity recognition based on conditional random fields. Cluster Computing, 22, 5195–5206. https://doi.org/10.1007/s10586-017-1146-3
Takaoka, M., et al. (2007). The action of Amagasaki City Health Center to the train derailment accident on the Japan Railway Fukuchiyama Line. Japanese journal of public health, 54(5), 324–337.
Tang, H., Ji, D., & Zhou, Q. (2020). End-to-end masked graph-based CRF for joint slot filling and intent detection. Neurocomputing, 413, 348–359. https://doi.org/10.1016/j.neucom.2020.06.113
Li, K., & Wang, S. (2018). A network accident causation model for monitoring railway Thirumalai, C., Sree, K. S., & Gannu, H. (2017). Analysis of cost estimation function for
safety. Safety Science, 109, 398–402. https://doi.org/10.1016/j.ssci.2018.06.008 Facebook web click data. In 2017 international conference of electronics,
Liu, C., Yang, S., Cui, Y., & Yang, Y. (2020). An improved risk assessment method based communication and aerospace technology (ICECA), vol 2 (pp. 172–175). https://doi.
on a comprehensive weighting algorithm in railway signaling safety analysis. Safety org/10.1109/ICECA.2017.8212788
Science, 128, Article 104768. https://doi.org/10.1016/j.ssci.2020.104768 Tibrewala, R., Ozhinsky, E., Shah, R., Flament, I., Crossley, K., Srinivasan, R., et al.
Liu, J. T., & Li, K. P. (2018). A cascading failure model for analyzing railway accident (2020). Computer-aided detection AI reduces interreader variability in grading hip
causation. Article 1750265 International Journal of Modern Physics B, 32(1). https:// abnormalities with MRI. Journal of Magnetic Resonance Imaging, 52, 1163–1172.
doi.org/10.1142/S0217979217502654. https://doi.org/10.1002/jmri.27164
Liu, J., Schmid, F., Zheng, W., & Zhu, J. (2019). Understanding railway operational Wang, J. F., Wang, J. G., Roberts, C., Chen, L., & Zhang, Y. (2017). A novel train control
accidents using network theory. Reliability Engineering & System Safety, 189, approach to avoid rear-end collision based on geese migration principle. Safety
218–231. https://doi.org/10.1016/j.ress.2019.04.030 Science, 91, 373–380. https://doi.org/10.1016/j.ssci.2016.08.025
Lu, R., et al. (2020). HAPE: A programmable big knowledge graph platform. Information Wang, X., Qu, Z., Song, X., Bai, Q., Pan, Z., & Li, H. (2021). Incorporating accident
Science, 509, 87–103. https://doi.org/10.1016/j.ins.2019.08.051 liability into crash risk analysis: A multidimensional risk source approach. Accident
Luo, W., Cai, F., Wu, C., & Meng, X. (2021). Bayesian network-based knowledge graph Analysis & Prevention, 153, Article 106035. https://doi.org/10.1016/j.
inference for highway transportation safety risks. Advances in Civil Engineering, 2021, aap.2021.106035
Article 6624579. https://doi.org/10.1155/2021/6624579. W. Wu J. May R., R. Maier, H., & C. Dandy, G. A benchmarking approach for comparing
Lyu, H. M., Zhou, W. H., Shen, S. L., & Zhou, A. N. (2020). Inundation risk assessment of data splitting methods for modeling water resources parameters using artificial
metro system using AHP and TFN-AHP in Shenzhen. Sustainable Cities and Society, neural networks Water Resources Research 49 11 2013 7598 7614 10.1002/
56, Article 102103. https://doi.org/10.1016/j.scs.2020.102103 2012WR012713.
Mannering, F. (2018). Temporal instability and the analysis of highway accident data. Wu, S., et al. (2020). Deep learning in clinical natural language processing: A methodical
Analytic Methods in Accident Research, 17, 1–13. https://doi.org/10.1016/j. review. Journal of the American Medical Informatics Association, 27(3), 457–470.
amar.2017.10.002 https://doi.org/10.1093/jamia/ocz200


Wu, T., Qi, G., Li, C., & Wang, M. (2018). A survey of techniques for constructing Chinese knowledge graphs and their applications. Sustainability, 10(9), Article 3245. https://doi.org/10.3390/su10093245
Yazdi, M., Korhan, O., & Daneshvar, S. (2020). Application of fuzzy fault tree analysis based on modified fuzzy AHP and fuzzy TOPSIS for fire and explosion in the process industry. International Journal of Occupational Safety and Ergonomics, 26(2), 319–335. https://doi.org/10.1080/10803548.2018.1454636
Yoo, S., & Jeong, O. (2020). Automating the expansion of a knowledge graph. Expert Systems with Applications, 141, Article 112965. https://doi.org/10.1016/j.eswa.2019.112965
Zhang, Q., et al. (2019). Construction of knowledge graphs for maritime dangerous goods. Sustainability, 11(10), Article 2849. https://doi.org/10.3390/su11102849
Zhang, Q., Lin, M., Jun, J., & Zhang, X. (2017). Research on text mining algorithm based on focused crawler. In 2017 12th international conference on computer science and education (ICCSE 2017) (pp. 454–457). August 22–25, 2017. https://doi.org/10.1109/ICCSE.2017.8085535
Zhao, Y., Xu, T., & Wang, H. (2014). Text mining based fault diagnosis of vehicle on-board equipment for high speed railway. In 17th International IEEE Conference on Intelligent Transportation Systems (ITSC) (pp. 900–905). https://doi.org/10.1109/ITSC.2014.6957803
Zhou, J. L., & Lei, Y. (2018). Paths between latent and active errors: Analysis of 407 railway accidents/incidents’ causes in China. Safety Science, 110(Part B), 47–58. https://doi.org/10.1016/j.ssci.2017.12.027
Zhou, Y., & Qiu, G. (2018). Random forest for label ranking. Expert Systems with Applications, 112, 99–109. https://doi.org/10.1016/j.eswa.2018.06.036
