
Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research

ZHIBO ZHANG1,2, HUSSAM AL HAMADI1,2, (Senior Member, IEEE), ERNESTO DAMIANI1,2, (Senior Member, IEEE), CHAN YEOB YEUN1,2, (Senior Member, IEEE), and FATMA TAHER3, (Senior Member, IEEE)
1 Center for Cyber-Physical Systems, Khalifa University, Abu Dhabi, United Arab Emirates
2 Department of Electrical Engineering and Computer Science, Khalifa University, Abu Dhabi, United Arab Emirates
3 College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
Corresponding author: Zhibo Zhang (e-mail: [email protected]).

ABSTRACT This survey presents a comprehensive review of the current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence, including Machine Learning (ML) and Deep Learning (DL), has been widely utilized in cyber security fields such as intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection of and defense against cyber attacks and threats are more advanced and efficient than conventional signature-based and rule-based cyber security strategies, most ML-based and DL-based techniques are deployed in a "black-box" manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. This lack of transparency and interpretability in existing Artificial Intelligence techniques decreases human users' confidence in the models deployed for defense against cyber attacks, especially at a time when cyber attacks are becoming increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models that maintain high accuracy while allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security, and a vast literature on applying XAI in many fields including healthcare, financial services, and criminal justice, there are surprisingly no survey research articles that concentrate on XAI applications in cyber security. Therefore, the motivation behind this survey is to bridge that research gap by presenting a detailed and up-to-date survey of XAI approaches applicable to issues in the cyber security field. Our work is the first to propose a clear roadmap for navigating the XAI literature in the context of applications in cyber security.

INDEX TERMS Artificial intelligence, cyber security, deep learning, explainable artificial intelligence, intrusion detection, machine learning, malware detection, spam filtering.

I. INTRODUCTION

Cyber security is the practice of securing networks, devices, and data against unauthorized access or illegal usage, as well as the art of maintaining information confidentiality, integrity, and availability [1], whereas cyber defensive mechanisms emerge at the application, network, host, and data levels [2]. As the Internet has become an essential tool in everyone's daily life, the number of systems linked to the Internet grows as well. The advancement of computer networks, servers, and mobile devices has significantly boosted Internet usage. However, the wide utilization of the Internet also tempts cyber attackers to develop more sophisticated and powerful cyber-attack methods for their benefit. It is noticeable that, with the number of internet users worldwide increasing by 0.3 billion in 2021 compared with the previous year [3], global cyber attacks increased by 29% in 2021 according to the 2021 Cyber Trends Report [4]. In June of 2022, a cyberattack on a software business caused thousands of individuals in multiple states of the USA to lose their unemployment benefits and job-search help [5], an event that can lead to severe social instability during the COVID-19 pandemic. As a matter of fact, according to the report by the European Union Agency for Network and Information Security (ENISA) [6], safe and trustworthy cyberspace is expected to become even more crucial in the new social and economic norms formed by the COVID-19 epidemic. These figures and events demonstrate that the Internet and its connected networks and devices now suffer from more cybercriminals and cyber attacks than ever. Therefore, a stable and secure cyber security system must be established to ensure the privacy, accessibility, and integrity of information transmitted over the Internet.

Nevertheless, the conventional signature-based and rule-based cyber defensive mechanisms are facing challenges with the increasing quantities of information spread over the Internet [7]. On the other hand, cyber hackers are always striving to keep one step ahead of law enforcement by generating new, smart, and intricate attacking techniques and by implementing technological advances, including Artificial Intelligence, to make their adversarial behaviors more sophisticated and efficient [8]. As a consequence, researchers in cyber security have begun to investigate Artificial Intelligence-based approaches, especially ML and DL, rather than traditional (non-AI) cybersecurity techniques such as game theory, rate control, and autonomous systems, to enhance the performance of cyber defensive systems.

Although Artificial Intelligence techniques, especially ML and DL algorithms, can provide impressive performance on benchmark datasets in a number of cyber security applications such as intrusion detection, spam e-mail filtering, botnet detection, fraud detection, and malicious application identification [9], they can commit errors, some of which are more expensive than those of conventional cyber defensive approaches. On the other hand, cyber security developers have sometimes sought higher accuracy at the price of interpretability, making their models more intricate and difficult to grasp [10]. This lack of explainability has been addressed by the European Union's General Data Protection Regulation, which preserves the capacity to comprehend the logic behind an Artificial Intelligence algorithmic decision that negatively impacts individuals [11]. Accordingly, to be able to trust the decisions of cyber security systems, Artificial Intelligence must be transparent and interpretable. To satisfy these demands, several strategies have been proposed to make Artificial Intelligence decisions more intelligible to humans. These explainable techniques are usually shortened as "XAI" and have already been implemented in many application domains such as healthcare, Natural Language Processing, and financial services [12]. The objective of this research paper is to focus on the applications of XAI in different fields in the context of cyber security.

A. RESEARCH MOTIVATION
Implementing Artificial Intelligence in applications of cyber security has been researched in recent years, and many previous surveys reviewed the existing work in this field. On the other hand, the trend of applying XAI to provide more explainable and transparent services in areas including healthcare and image analysis is popular in research as well. However, to the best of our knowledge, although there are other excellent survey papers available on the topics of XAI and cyber security independently, there is a lack of a comprehensive survey paper focusing on the review of solutions based on XAI across a wide variety of cyber security applications. This survey also concludes with deep analytical insights drawn from the reviewed works. These findings reveal several gaps that may be filled using XAI methods, indicating the overall future direction of research in this domain.

In general, this survey intends to provide a comprehensive review of state-of-the-art XAI applications in the cyber security area. The research motivations behind this work are listed as follows:
(1) To review different techniques and categorizations of XAI.
(2) To review existing challenges and problems of XAI.
(3) To identify the frameworks and available datasets for XAI-based cyber defensive mechanisms.
(4) To review the latest successful XAI-based systems and applications in the cyber security domain.
(5) To identify challenges and research gaps of XAI applications in cyber security.
(6) To identify the key insights and future research directions for applying XAI in the cyber security area.

B. PREVIOUS SURVEYS
XAI and cyber security have mostly been reviewed separately in previous surveys. However, crossovers have emerged between the two domains. This survey presents a comprehensive introduction to the different XAI techniques applied in cyber defensive systems. Our work also provides comprehensive XAI categorizations and analyzes the existing challenges and frameworks of XAI for cyber security in detail. Cyber security datasets available for XAI models and the cyber threats faced by XAI models are discussed in this paper as well. Table 1 contrasts our study with currently available surveys and review articles. Many existing surveys only analyzed Artificial Intelligence (AI) applications, either ML or DL, in the cyber security area, whereas other authors review XAI methods for a narrow set of cyber security applications. Some reviews did not describe the background of XAI and cyber security in detail. Furthermore, most articles discuss
TABLE 1. Comparison of existing surveys with our work (legend: √ means included; N/A means not included; ≈ means partially included)

No. | Ref. | Year | XAI Categorization | XAI Framework | ML | DL | XAI Evaluation | XAI Challenges | Cyber security datasets | Cyber attacks | Industrial applications | Adversarial threats on XAI | Key insights and future directions
1 | [13] | 2016 | N/A | N/A | √ | N/A | N/A | N/A | N/A | √ | ≈ | N/A | ≈
2 | [14] | 2016 | N/A | N/A | √ | √ | N/A | N/A | √ | ≈ | ≈ | ≈ | √
3 | [15] | 2017 | N/A | N/A | N/A | √ | N/A | N/A | √ | √ | ≈ | N/A | √
4 | [16] | 2018 | N/A | N/A | √ | ≈ | N/A | N/A | √ | √ | ≈ | N/A | √
5 | [17] | 2018 | N/A | N/A | √ | ≈ | N/A | N/A | √ | √ | ≈ | N/A | √
6 | [18] | 2019 | N/A | N/A | √ | ≈ | N/A | N/A | √ | √ | ≈ | N/A | √
7 | [19] | 2019 | √ | √ | ≈ | ≈ | √ | √ | N/A | N/A | √ | N/A | √
8 | [20] | 2020 | ≈ | ≈ | N/A | N/A | N/A | N/A | √ | N/A | √ | N/A | √
9 | [7] | 2021 | N/A | N/A | √ | N/A | N/A | N/A | √ | √ | ≈ | N/A | √
10 | [21] | 2018 | N/A | N/A | √ | √ | N/A | ≈ | ≈ | √ | ≈ | N/A | √
11 | [22] | 2018 | N/A | N/A | √ | √ | N/A | N/A | √ | ≈ | √ | ≈ | √
12 | [23] | 2018 | N/A | N/A | √ | √ | N/A | N/A | √ | √ | ≈ | ≈ | ≈
13 | [24] | 2018 | N/A | N/A | √ | √ | N/A | N/A | √ | √ | ≈ | √ | √
14 | [25] | 2022 | √ | √ | √ | √ | ≈ | √ | N/A | N/A | √ | N/A | √
15 | [26] | 2021 | N/A | √ | √ | N/A | ≈ | N/A | √ | √ | N/A | √ | ≈
16 | [27] | 2021 | √ | √ | ≈ | ≈ | N/A | √ | √ | N/A | N/A | N/A | √
17 | [28] | 2019 | N/A | N/A | √ | √ | N/A | N/A | √ | √ | ≈ | ≈ | √
18 | [29] | 2019 | √ | √ | N/A | √ | ≈ | √ | √ | √ | N/A | N/A | √
19 | [2] | 2019 | √ | √ | ≈ | √ | N/A | √ | √ | √ | N/A | N/A | √
20 | [9] | 2019 | √ | √ | ≈ | ≈ | ≈ | √ | N/A | √ | ≈ | N/A | √
21 | [30] | 2022 | √ | N/A | ≈ | ≈ | ≈ | √ | ≈ | √ | √ | √ | √
22 | [31] | 2021 | ≈ | ≈ | √ | √ | N/A | N/A | √ | √ | ≈ | √ | √
23 | [32] | 2020 | N/A | N/A | ≈ | ≈ | N/A | N/A | √ | √ | √ | √ | ≈
24 | [33] | 2020 | √ | √ | √ | N/A | √ | N/A | N/A | √ | N/A | ≈ | √
25 | [34] | 2021 | ≈ | ≈ | ≈ | N/A | ≈ | N/A | √ | √ | √ | ≈ | √
26 | [10] | 2021 | √ | √ | √ | √ | ≈ | ≈ | N/A | N/A | N/A | N/A | ≈
27 | [12] | 2021 | √ | ≈ | √ | √ | N/A | ≈ | N/A | N/A | ≈ | N/A | √
28 | [35] | 2022 | √ | √ | N/A | N/A | N/A | √ | ≈ | ≈ | √ | N/A | √
29 | [36] | 2022 | √ | √ | N/A | ≈ | ≈ | N/A | √ | √ | N/A | √ | √
30 | Our Paper | 2022 | √ | √ | √ | √ | √ | √ | √ | √ | √ | √ | √
FIGURE 1. Structure of this paper.
only AI applications in cyber security or XAI implemented in other domains, rather than focusing on cyber security.

From Table 1, it is obvious that this survey is comprehensive and distinct compared to previously published survey research in the field in that it includes the following features: summarizing commonly used available cyber security datasets, discussing popular XAI tools and their applications in the cyber security area, analyzing XAI applications in defending against different categories of cyber attacks, providing assessment measures for evaluating the performance of XAI models, describing the adversarial cyber attacks which XAI itself may suffer, and pointing out key insights about applying XAI for cyber security.

C. SCOPE OF CYBER SECURITY ANALYSED
In agreement with the International Organization for Standardization (ISO/IEC 27032) [37], cyber security is defined as the privacy, integrity, and availability of internet data. Cyber attacks are cybercriminal attacks undertaken using one or more computers against a single or numerous computers or networks. A cyber assault can purposefully destroy systems, steal data, or utilize a compromised computer as a launch pad for more attacks [38]. Due to the wide spread of cyber attacks and threats, the cyber security industry is seeing rapid expansion. As a result, by 2026, the worldwide cybersecurity sector is anticipated to be worth 345.4 billion USD [39]. On the other hand, besides the conventional cyber attacks including malware, botnet, and spam, adversarial cyber security threats specifically targeting AI models have gradually emerged in recent years as well [24]. Therefore, the scope of the cyber security domain analyzed in this survey paper is constituted of the following 3 sub-fields in conjunction with XAI:
1) Different categories of the most prominent cyber attacks, including malware, botnet, spam, fraud, phishing, Cyber Physical Systems (CPSs) attacks, network intrusion, Denial-of-Service (DoS) attacks, Man-in-the-Middle (MITM) attacks, Domain Generation Algorithms (DGAs), and Structured Query Language (SQL) injection attacks, are described in detail respectively. By doing so, the terminologies of cyber attacks are made clear, and the defensive systems against these attacks are discussed in this paper as well.
2) Cyber security implementation in different industrial areas, including the smart grid, healthcare, smart agriculture, smart transportation, Human-Computer Interaction (HCI), and the smart financial

FIGURE 2. Research methodology flow chart.
system, will be reviewed in this survey. This paper provides a brief introduction to XAI for cyber security in each domain respectively.
3) While XAI is implemented in many different scenarios to defend against cyber threats, XAI models will face adversarial attacks targeting XAI models as well. This survey investigates cyber security from this perspective too. Adversarial threats targeting XAI, defense approaches against these attacks, and the establishment of secure XAI cyber systems are interpreted respectively.

D. CONTRIBUTIONS
This study extensively evaluates current breakthroughs and state-of-the-art XAI-based solutions in a wide variety of cyber security applications and cyber attack defensive mechanisms to address the gaps and shortcomings identified in earlier surveys. There is no previous survey available that systematically analyzes state-of-the-art XAI applications in cyber security from the perspectives of both cyber attack defensive schemes and industrial applications. Our research's contributions can be summarized in the following points:
1) We rationalize the motivations for integrating XAI in AI-based cyber security models while presenting the basic background on XAI.
2) We provide a thorough summary as well as a quick overview of the datasets that are accessible for the usage of XAI applications in cyber security.
3) We discuss different categories of defensive applications of XAI against cyber attacks respectively, and we highlight the advantages and limitations of developing XAI-based cyber-defense systems.
4) We justify XAI for cyber security in different industry scenarios.
5) We illustrate the adversarial cyber threats targeting XAI models as well as the defense approaches against these attacks.
6) We outline the outstanding issues and existing challenges associated with the intersection of XAI and cyber security, and we identify the key insights and future research directions for XAI applications in cyber security.

E. STRUCTURE OF THIS SURVEY
As shown in Fig. 1, this survey has been organized in such a way that the background information for the research being examined comes first. Section II introduces the methodology of research for this survey in the field of XAI applications in cyber security. Section III discusses the general background of XAI; the motivations, categorizations, and challenges of XAI are justified in this section. The section after that (Section IV) is organized around the XAI frameworks and available datasets for cyber security. Section V is devoted to a
comprehensive discussion of XAI applications in cyber security from different perspectives. The existing challenges, key insights, and future directions of this area are highlighted in Section VI, which is followed by the conclusion in Section VII, the last section.

II. METHODOLOGY OF RESEARCH

The research methodology flow chart of this survey is described in Figure 2. As mentioned in Section I, the goal of this study was to investigate the research state-of-the-art in the areas of XAI applications in cyber security. Therefore, to collect the research articles reviewed, the following criteria were established:
1) A thorough search was carried out in which the different academic search engines listed in Table 2 were utilized to collect the relevant papers.

TABLE 2. Research searching database engines.

Searching Engine | Database Address
Springer | https://link.springer.com/
Taylor & Francis | https://taylorandfrancis.com/
Semantic Scholar | https://www.semanticscholar.org/
ACM Digital Library | https://dl.acm.org/
ResearchGate | https://www.researchgate.net/
Google Scholar | https://scholar.google.com/
IEEE Xplore | https://ieeexplore.ieee.org
Elsevier | https://www.elsevier.com/
Research Rabbit | https://researchrabbitapp.com/

2) The searching keywords for this survey paper were constituted of two aspects: "XAI" and "Cyber Security". To create the search string, all potentially pertinent synonyms of the given terms were identified in the different databases, and the percentage of reviewed papers from each source is depicted in Figure 3. The following synonyms were considered pertinent to the subject: "Cyber Security", "Cyber Physical", "Cyber Attack", "Cyber Threat", "Network Security", "Cyber Crime", "XAI", "Explainable Artificial Intelligence", "Interpretable Artificial Intelligence", "Explainable ML (XML)", and "Transparent Artificial Intelligence".

FIGURE 3. Percentage of Reviewed Papers from Sources.

3) Only research published between 2011 and 2022 was selected, to report on the most recent trends in the application of XAI techniques in cyber security. Besides, papers published after 2017 were given higher attention and occupy a large proportion of all reviewed publications, as shown in Figure 4.

FIGURE 4. Percentage of Papers included from 2011 to 2022.

4) Only publications written in the English language were included in this review, and duplicated studies were excluded.
5) Only papers addressing cyber security vulnerability domains were reviewed in this survey paper, from which research proposing ML-based systems, DL-based systems, XAI-based mechanisms, and AI-based mechanisms was extracted.

The procedure of choosing articles consisted of two steps: firstly, the search results were initially screened against the selection criteria by scanning the publications' titles and abstracts; secondly, the documents chosen in the initial phase were thoroughly read to create a shortlist of published articles selected based on the inclusion and exclusion criteria.

III. XAI BACKGROUND

As introduced in Section I, the concept of XAI is defined as the set of techniques that improve human understanding of how AI makes decisions [10]. In this section, we review the general background of XAI, providing the necessary prior knowledge so that readers can better understand the following sections introducing XAI applications in cyber security.
FIGURE 5. A Venn diagram showing the connections between words used frequently in the XAI domain.

Before exploring the XAI background deeply, it is worth clarifying the terminology of the XAI domain. Numerous concepts and phrases, including intelligibility, explainability, transparency, and interpretability, have been used to characterize XAI recently [40], and the relationships between these terms are shown in Figure 5. Among these terms, interpretability is defined as a concept similar to explainability [41]. However, in recent years, the meaning of the term "interpretability" has shifted toward information extraction rather than the provision of explanations [42], meaning that the terms interpretability and explainability are becoming more distinct while still intersecting with each other. Therefore, in this study, we focus on the "explainability" side of XAI, whereas reviewed papers focusing on the "intelligibility" and "transparency" aspects were included or excluded according to their overlap with the concept of "explainability".

In the following subsections, we introduce the background of XAI from different perspectives, including the motivations to integrate XAI into cyber security, categorizations of XAI, and existing challenges of XAI. The purpose of this section is to provide readers with a general description of the XAI area so that they can gain a deeper understanding of the later parts on XAI applications in cyber security.

A. MOTIVATIONS TO INTEGRATE XAI INTO CYBER SECURITY
Given the constant growth in complexity and volume of cyber attacks, including malware, intrusion, and spam, coping with them is becoming increasingly difficult [17]. According to [43], conventional algorithms, including rule-based, statistics-based, and signature-based approaches, are utilized to detect intrusions in the cyber security area. However, due to the growing amount of data being communicated over the Internet and the emergence of new networking paradigms including the Internet of Things (IoT), cloud computing, and fog/edge computing [44], these traditional approaches have a low capacity to process massive amounts of data and incur high computing costs [7].

On the other hand, artificial intelligence serves as one of the foundational technologies of Industry 4.0 [31]. Therefore, AI techniques, including ML and DL algorithms, have played a significant part in the provision of intelligent cyber security services and management in recent years. For instance, Daniele et al. [17] surveyed the implementation of ML methods for malware analysis, including malware detection, malware similarity analysis, and malware category analysis. And Donghwoon et al. [15] applied DL-based approaches to network anomaly detection and network traffic analysis.

Nevertheless, due to the limitations of AI-based approaches, the applications of AI in the cyber security area are facing challenges as well. For instance, access to cybersecurity-related data [45], adversarial attacks on AI models [46], and ethics and privacy issues [47] are typical inherent limitations suffered by AI-based cyber security systems. Among these drawbacks, the black-box nature of AI models is a severe limitation that deserves particular attention when AI models are integrated into the cyber security domain [48]. Because of AI models' black-box characteristics, the cybersecurity-related decisions generated by AI-based models lack rationale and justifiability, and it is therefore difficult for people to understand how these results are produced [49]. In this case, the cyber defensive mechanisms would become black-box systems that are extremely vulnerable to information breaches and AI-based cyber threats [50].

Therefore, to deal with the drawbacks of utilizing AI for cyber security, XAI emerged as a reaction to the growing black-box issue with AI. Users and specialists can understand the logical explanation and main data evidence thanks to the interpretability that XAI contributes to the results produced by AI-based statistical models [19].

To conclude, the motivations to apply XAI to cyber security are given as follows:
1) Building trust is a key objective for integrating XAI, which is closely related to transparency and understanding of cybersecurity-related decision models.
2) Another motivation to apply XAI in the cyber security area is to comply with many new regulations and General Data Protection Regulation (GDPR) laws [51] calling for explanations to be provided to the entire society in various fields, including cyber security.
3) Justice, social responsibility, and risk mitigation are significant concerns for applying XAI in cyber security, because protecting cyber security may involve serious social problems, sometimes even human lives, and not just cost-benefit calculations.
4) Cyber security system biases and the work by analyzing feature inputs and outputs and do not have
misunderstanding of their effectiveness have access to the models’ internal information, such as weights or
emerged as key drivers for XAI. For instance, structural information by definition. Shapley Additive
biased training data occurs as a problem that affects Explanations (SHAP) tools [59], Saliency Map [60], and
the model's output's credibility, in particular when Gradient-weighted Class Activation Mapping (Grad-CAM)
working with neural networks that learn patterns [61] are widely used model-agnostic explanation tools.
from training data [52]. 3) LOCAL OR GLOBAL
5) Ability to provide obliged and decent justification Explanations of the decision models can be divided as local
for the cyber security system. By doing so, the or global depending on the model's scope. Local
created cyber security defensive mechanisms can explainability describes a system's capacity to show a user
not only be fair and socially responsible for the why a particular choice or decision was made. Some popular
decisions, but also defend their results with explainability methods such as LIME [56], SHAP [59], and
justifications. counterfactual explanations [62] can be filed under this
category. Local explainability methods are emphasized as the
B. CATEGORIZATIONS OF XAI first crucial component of model transparency [55]. In the
According to [53], [54], the XAI categories can be structured contrast, global explainability refers to the explanation of the
in a variety of aspects shown in Figure 6. It is noticeable that learning algorithm as a whole, taking into account the
the categorization methods are not ideal, meaning that training data utilized, the algorithms' proper applications, and
overlapping may happen and one specific XAI technique can any cautions regarding the algorithm's flaws and improper
be categorized into one or more aspects. Therefore, it would applications. Global Attribution Mapping (GAM) is
be more precise and concrete if we categorized one XAI proposed in [63] as a global explaination approach to explain
technique from different categorization perspectives. By the landscape of neural network predictions across
doing so, more information and characteristics of this XAI subpopulations.
approach could be revealed at different levels. 4) EXPLANATION OUTPUT
1) INTRINSIC OR POST-HOC The explanation output is also a crucial component of XAI
This categorization method distinguishes between achieving categorization for the reason that the format of the
explainability by limiting the complexity of the AI model explanation output would have a strong influence on certain
(intrinsic) or by analyzing the methodology of the model after training (Post-hoc), to differentiate how explainability is achieved. An intrinsic XAI approach produces the explanation concurrently with the prediction, using data that the model emits as part of the prediction-making process [55]. Some ML models, including Decision Trees and Sparse Linear models, are regarded as intrinsic XAI approaches because they are self-explaining. Post-hoc explanations, on the other hand, apply interpretation methods after the models have been trained and the decisions have already been made. Local Interpretable Model-agnostic Explanations (LIME) [56] and Permutation Importance [57] are typical Post-hoc explanation methods that work independently as an external interpretable model.
2) MODEL-SPECIFIC OR MODEL-AGNOSTIC
XAI methods can also be classified according to the classes of models to which they can be applied: model-specific or model-agnostic. Model-specific explanation tools are tied to a single model or group of models. For instance, the graph neural network explainer [58] is a method for presenting comprehensible justifications for any GNN-based model's predictions on any graph-based ML problem. In contrast, model-agnostic explanation tools can in theory be applied to any ML model. Furthermore, model-agnostic explanation methods usually cannot access the internals of the model, such as its weights or structural information.
XAI approaches can further be distinguished by the format of the explanation presented to users. For instance, text-based explanation methods are widely utilized in the field of Natural Language Processing (NLP) to extract fine-grained information and generate human-readable explanations [64]. On the other hand, visualized explanation approaches are used in broader domains including NLP [65], neural networks [66], and healthcare [67]. In fact, the majority of feature summary statistics can also be visualized, and some feature summaries are only meaningful when visualized [68]. Arguments-based explanations involve outlining the features in a way that humans use to come to decisions, helping humans to better understand the relevance of a feature [69]. Model-based explanation approaches need to outline the internal working logic of a black-box model, which is often accomplished by approximating the black-box model's behavior with a different model that is more interpretable and transparent [10]. For instance, Wu et al. [70] proposed a model-specific technique aiming to reduce the complexity of a Deep Neural Network (DNN) model by introducing a model complexity penalty function, and Lakkaraju et al. [71] proposed a model-agnostic technique called Model Understanding through Subspace Explanations (MUSE), which learns the behavior of a specific black-box model by yielding a small number of tight decision sets.
FIGURE 6 An overview diagram showing the categorization of XAI in different aspects.
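As a concrete illustration of a Post-hoc, model-agnostic method, the sketch below computes permutation importance for an arbitrary trained classifier by querying only its predictions. The toy model, feature names, and data are hypothetical, not drawn from any of the surveyed works.

```python
import random

# Toy "black-box" model: flags a connection as malicious when the
# packet rate is high; the second feature is deliberately irrelevant.
def model_predict(x):
    packet_rate, port_parity = x
    return 1 if packet_rate > 50.0 else 0

def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx, seed=0):
    """Drop in accuracy after shuffling one feature column (Post-hoc and
    model-agnostic: only the model's predictions are queried)."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    column = [x[feature_idx] for x in X]
    rng.shuffle(column)
    X_perm = [list(x) for x in X]
    for row, v in zip(X_perm, column):
        row[feature_idx] = v
    return baseline - accuracy(model, X_perm, y)

# Hypothetical labeled traffic: (packet_rate, port_parity) -> malicious?
X = [[10, 0], [90, 1], [20, 1], [75, 0], [5, 0], [99, 1], [30, 0], [60, 1]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

imp_rate = permutation_importance(model_predict, X, y, 0)
imp_parity = permutation_importance(model_predict, X, y, 1)
print(imp_rate, imp_parity)
```

Because the method never inspects the model's internals, the same function would work unchanged for a DNN, an SVM, or a rule list, which is exactly what makes it model-agnostic.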
C. EXISTING CHALLENGES OF XAI
Although the research community has regarded XAI as a solution to the trust and dependency issues posed by conventional black-box AI-based systems, XAI still faces challenges from different perspectives: XAI security, XAI performance evaluation, legal and privacy issues, and the trade-off between interpretability and accuracy. Table 3 provides a summary of these challenges.
1) XAI SECURITY
Some frequently deployed XAI models are susceptible to adversarial attacks, which raises public concern about the security of XAI [72].
Guo [73] highlighted the necessity of developing defense mechanisms that can recognize targeted attacks against XAI engines, especially because building and quantifying trust with human end-users is essential for 6G to enable higher levels of safety-critical autonomy across a variety of industries. Fatima et al. [74] also pointed out that it would be fascinating to look into adversarial ML and deep models (i.e., the application of ML and DL in adversarial circumstances) in XAI, and highlighted three main factors that enable the security of AI models: changes in the input data used by learning models, bias, and fairness.
Slack et al. [75] criticized some post-hoc explanation methods such as LIME and SHAP by demonstrating that carefully crafted, extremely biased (racist) classifiers can easily fool these popular explanation techniques. Besides, for Deep Neural Network (DNN) models specifically, Cleverhans et al. [76] looked for adversarial vulnerabilities with the DeepFool tool and offered several methods to harden the model against them.
2) XAI PERFORMANCE EVALUATION
The effectiveness of an XAI method can be evaluated and measured in a variety of ways. However, there is currently no accepted system for determining whether one XAI system is more user-intelligible than another [77]. In papers [78] and [79], strong concerns were raised that choosing the best technique for explainability requires a well-established evaluation system for explainability.
For the evaluation of the explanations given by post-hoc XAI approaches on tabular data, Julian et al. [80] proposed a definition of feature relevance in Boolean functions and a testing environment based on fictitious datasets. In paper [81], Leila et al. addressed the absence of an impartial and widely acknowledged heatmap quality measure by presenting a framework for evaluating XAI algorithms using ground truth based on the CLEVR visual question answering task.
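The scaffolding idea behind the attack of Slack et al. [75] can be sketched in a few lines: a classifier that is biased on real inputs detects the off-manifold perturbations that a LIME-style explainer generates and answers them with an innocuous rule instead. Everything below (the synthetic data, both decision rules, and the integer-valued "real input" test) is a made-up toy, not the authors' implementation.

```python
import random

def is_real(x):
    # Real inputs in this toy world are integer-valued feature vectors;
    # LIME-style perturbations add continuous noise, so they are not.
    return all(float(v).is_integer() for v in x)

def scaffolded_classifier(x):
    sensitive, benign = x
    if is_real(x):
        return 1 if sensitive > 0 else 0   # biased rule on real data
    return 1 if benign > 0 else 0          # innocuous rule shown to the explainer

def lime_style_perturb(x, rng, n=100):
    return [[v + rng.gauss(0, 0.3) for v in x] for _ in range(n)]

rng = random.Random(0)
x_real = [1, 1]
perturbed = lime_style_perturb(x_real, rng)

# The explainer only ever queries perturbed samples, so it observes the
# innocuous rule, even though the decision on x_real used the biased one.
biased_used = scaffolded_classifier(x_real)
fraction_innocuous = sum(
    scaffolded_classifier(p) == (1 if p[1] > 0 else 0) for p in perturbed
) / len(perturbed)
print(biased_used, fraction_innocuous)
```

Any perturbation-based explainer that queries the model on such off-manifold samples would reconstruct only the innocuous rule, which is why defenses must also check how the model behaves on in-distribution inputs.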
3) LEGAL AND PRIVACY ISSUES
Besides the technical challenges described above, XAI faces significant legal and privacy issues as well. In numerous instances, including some well-known court cases, XAI systems have produced a history of biased legal and privacy outcomes [89].
Arun [82] raised concerns about the role of XAI in influencing the privacy calculus of individuals, especially the privacy concerns of customers in marketing AI applications. The European Commission (EC) has also published ethical guidelines for Trustworthy AI as a legal document [83], highlighting respect for privacy, the quality and integrity of data, and access to data.
The General Data Protection Regulation (GDPR) [84] of the EU has added clarification to its information security architecture. In Recital 71, the word "explanation" is mentioned, outlining the human right to contest a decision made following such an evaluation and to obtain an explanation of the decision. Furthermore, Martin [85] investigated whether and to what degree people have a legal right to an explanation of automated decision-making under EU law, particularly when AI systems are involved.
4) THE TRADE-OFF BETWEEN INTERPRETABILITY AND ACCURACY
The explainability and performance (predictive accuracy) of a model are generally shown to trade off against each other [90]. In fact, there is a demand for explainable models that can attain high performance, because the algorithms that currently perform best are frequently the least explainable (for example, DL) [53].
Although simple models are frequently favored for their ease of explanation [91], their explainability may be compromised when highly engineered or high-dimensional features are used [86].
Amann et al. [87] adopted a multidisciplinary approach to analyze the relevance of explainability for medical AI from different perspectives, showing the necessity of applying XAI in clinical practice even though the primary objective is to give patients the finest care possible [88].

TABLE 3. Summary of XAI challenges.

| Challenges | Reference | Descriptions |
| XAI security | [73] | The necessity to develop defense mechanisms against attacks, especially for building 6G industries. |
| | [74] | The application of ML and DL in adversarial circumstances; be aware of the input data. |
| | [75] | Criticized some post-hoc explanation methods such as LIME and SHAP by fooling these techniques. |
| | [76] | Discussed the DeepFool tool targeting DNN models and offered several methods against it. |
| XAI performance evaluation | [77] | Outlined the fact that there is no accepted system for determining an XAI system's priority. |
| | [78] | Proposed strong concerns about choosing the best technique for explainability. |
| | [80] | Proposed a definition of feature relevance in Boolean functions and a testing environment. |
| | [81] | Presented a framework for evaluating XAI algorithms based on the CLEVR visual question answering task. |
| Legal and privacy issues | [82] | Proposed concerns about the role of XAI in marketing AI applications. |
| | [83] | The European Commission (EC) published ethical guidelines for Trustworthy AI and highlighted privacy. |
| | [84] | The GDPR of the EU outlined the human right to contest a decision and to get an explanation of it. |
| | [85] | Discussed to what degree people have a legal right to an explanation of automated decision-making under EU law. |
| The trade-off between interpretability and accuracy | [53] | Outlined the fact that the best-performing algorithms are frequently the least explainable, such as DL. |
| | [86] | Pointed out that models' explainability may be compromised when highly engineered or high-dimensional features are used. |
| | [87] | Adopted a multidisciplinary approach to analyze the relevance of explainability for medical AI from different perspectives. |
| | [88] | Argued the necessity to apply XAI in clinical practice. |

IV. XAI FRAMEWORK AND DATASETS FOR CYBER SECURITY

A. XAI FRAMEWORK FOR CYBER SECURITY
In this section, based on the publications carefully reviewed for this survey, we provide a general XAI framework for cyber security applications. The conceptual framework diagram for XAI applications in cyber security is illustrated in Figure 7. The diagram is kept as general as possible to show the processes of applying XAI across cyber security domains. The workflow consists of several stages, with sample instances presented at each stage.
The framework workflow starts by determining the type of cyber security task, such as malware detection, spam detection, or fraud detection, which is defined by the type of cyber attack being faced. The corresponding data, such as emails, network traffic, and application activities, are collected and processed in the next stages. Features representing significant characteristics are then extracted and fed to train different Artificial Intelligence models, depending on the specific situation. After the models have been trained, cyber security test samples are analyzed and decisions are made. Users obtain decisions and explanations directly from self-interpretable models, whereas predictions made by black-box models require explanations from XAI models to satisfy the users requesting the cyber security tasks. Note that this diagram is only a general workflow of XAI applied in cyber security, and the details may differ for specific tasks.
FIGURE 7 The conceptual framework diagram for XAI applications in cyber security.
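The stages of this workflow (data collection, feature extraction, model training, decision, explanation) can be sketched end-to-end. The sketch below uses a tiny hand-rolled naive Bayes spam filter on made-up emails, with per-word log-odds serving as the self-interpretable explanation; all data and names are illustrative, not part of the framework itself.

```python
import math
from collections import Counter

# Stages 1-2: task selection and (hypothetical) collected emails with labels.
emails = [("win free prize now", 1), ("free money win", 1),
          ("meeting agenda attached", 0), ("project meeting notes", 0)]

# Stage 3: feature extraction -> bag of words.
def tokens(text):
    return text.split()

# Stage 4: train a naive Bayes model (an intrinsically interpretable one).
spam_counts, ham_counts = Counter(), Counter()
for text, label in emails:
    (spam_counts if label else ham_counts).update(tokens(text))
vocab = set(spam_counts) | set(ham_counts)

def log_odds(word):
    # Laplace-smoothed evidence the word contributes toward "spam".
    p_spam = (spam_counts[word] + 1) / (sum(spam_counts.values()) + len(vocab))
    p_ham = (ham_counts[word] + 1) / (sum(ham_counts.values()) + len(vocab))
    return math.log(p_spam / p_ham)

# Stage 5: decision plus explanation for a new test sample.
def classify_and_explain(text):
    contributions = {w: log_odds(w) for w in tokens(text) if w in vocab}
    score = sum(contributions.values())
    return ("spam" if score > 0 else "ham"), contributions

label, why = classify_and_explain("free prize meeting")
print(label, sorted(why, key=why.get, reverse=True))
```

With a black-box model, stage 5 would instead hand the prediction to an external XAI method, but the shape of the pipeline stays the same.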

B. CYBER SECURITY DATABASES
It is undeniable that the judicious selection and use of data is highly significant for cyber security research [92]. Moreover, the quality and quantity of the data significantly influence the decisions of XAI models, both DL-based and ML-based. Although cyber security data can be gathered straightforwardly through numerous methods, such as using software tools like WinDump or Wireshark to capture network packets, these methods are mainly suited to gathering narrow or low volumes of data, and high acquisition times and expenses are required [93]. Therefore, the use of benchmark cyber security datasets can reduce the time spent on data gathering and improve the effectiveness of research. Researchers can train, verify, and evaluate XAI-based cyber security solutions using these benchmark datasets. In this section, we introduce and describe the most significant datasets employed in cyber security, from the perspectives of the most prominent categories of cyber attacks and of cyber security implementations in different industrial areas, respectively.
Table 4 details frequently used publicly accessible datasets in the context of cyber attacks, including malware, botnet, spam, DGA, DoS, CPSs, phishing, and network intrusion. It is noteworthy that there is some overlap, because some datasets contain several categories of cyber attacks. Table 5, on the other hand, presents a comprehensive overview of XAI applications for cyber security in distinct industries, including smart cities, healthcare, smart agriculture, smart transportation, smart financial systems, and Human-Computer Interaction (HCI). These industrial datasets show the potential of applying XAI for cyber security in these domains.

V. XAI APPLICATIONS TO CYBER SECURITY
This section provides a comprehensive overview of XAI applications in the areas of cyber security from different viewpoints. We categorize these applications into three main groups: defensive applications of XAI against cyber attacks, potentials of XAI applications for cyber security in different industries, and cyber adversarial threats targeting XAI applications together with defense approaches against these attacks. Important existing works under each of these domains are introduced in detail.
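Before any XAI-based solution is trained, verified, and evaluated on such benchmark datasets, the data are typically partitioned deterministically so that results are reproducible. The sketch below shows one such split; the record format is hypothetical and not tied to any specific dataset listed here.

```python
import random

# Hypothetical labeled records: (feature vector, label), where the label
# marks a flow as benign (0) or malicious (1).
records = [([i, i % 3], i % 2) for i in range(100)]

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=42):
    """Deterministically shuffle and partition a benchmark dataset into
    training (fit), validation (verify), and test (evaluate) subsets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(records)
print(len(train), len(val), len(test))
```

Fixing the seed matters when comparing XAI-based detectors across papers: otherwise differences in the split, not the method, can explain differences in reported accuracy.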
TABLE 4. Some publicly available datasets in the context of cyber attack categories.

| Cyber Attack Category | Reference | Dataset Name | Year | Cited Number | Dataset Details |
| Malware | [94] | N-BaIoT | 2018 | 644 | N-BaIoT contains real traffic (115 numerical features) of 9 commercial IoT devices infected with 2 IoT-based botnets, Mirai and BASHLITE. |
| | [95] | IoTPOT | 2016 | 219 | 500 IoT malware samples from four key families are included in IoTPOT, which was compiled via an IoT honeypot. These IoT devices ran on different CPU architectures such as ARM, MIPS, and PPC. |
| | [96] | IoT-23 | 2020 | 381 | IoT-23 is a dataset of Internet of Things (IoT) device network traffic. It captures 20 malware executions and 3 benign traffic captures from IoT devices. |
| | [97] | EMBER | 2018 | 223 | EMBER includes features extracted from 1.1M binary files: 900K training samples (300K malicious, 300K benign, 300K unlabeled) and 200K test samples (100K malicious, 100K benign). |
| | [98] | Genome Project | 2012 | 2689 | More than 1,200 malware samples covering the majority of the then-current Android malware families were collected in this dataset and systematically characterized from various aspects. |
| | [99] | VirusShare | Updating | N/A | VirusShare is a collection of 48,195,237 malware samples. It is frequently utilized for malware analysis and detection. |
| | [100] | CICAndMal2017 | 2018 | 143 | CICAndMal2017 provides a methodical way to build Android malware datasets using actual smartphones instead of emulators. More than 10,854 samples (4,354 malware and 6,500 benign) were collected. |
| | [101] | DREBIN | 2014 | 2102 | DREBIN performs a thorough static analysis of the Android platform to gather as many features of an application as feasible. 5,560 applications from 179 different malware families were collected. |
| Spam | [102] | SMS Spam v.1 | 2011 | 367 | This dataset offered a new real, public, and non-encoded SMS spam collection. |
| | [103] | EnronSpam | 2006 | 743 | The Enron Corpus is a database of over 600,000 emails generated by 158 employees of the Enron Corporation. |
| | [104] | ISCX-URL2016 | 2016 | 100 | Around 114,400 URLs were collected initially in this dataset, containing benign and malicious URLs in four categories: Spam, Malware, Phishing, and Defacement. |
| Network Intrusion | [105] | NSL-KDD | 2009 | 3730 | To solve the issues of the KDD data set, a new data set, NSL-KDD, was proposed, which consists of selected records of the complete KDD data set. |
| | [106] | UNB ISCX 2012 | 2012 | 1027 | The Canadian Institute for Cybersecurity at the University of New Brunswick (UNB) established UNB ISCX 2012 in 2012. Over seven days, traffic was recorded in a simulated network environment. |
| | [107] | AWID | 2016 | 365 | AWID is a labeled dataset with an emphasis on 802.11 networks. To collect WLAN traffic in a packet-based format, a small network environment with 10 clients was created, and 15 distinct attacks were carried out. |
| | [108] | CIC-IDS2017 | 2018 | 1672 | The CIC-IDS2017 dataset includes a variety of user profiles (creating background traffic) and multistage attacks such as Heartbleed and DDoS. Eighty traffic features were extracted using the CICFlowMeter program. |
| | [109] | CIC-DDoS2019 | 2019 | 309 | The CIC-DDoS2019 dataset contains a wide variety of DDoS attacks executed using TCP/UDP application layer protocols. |
| | [110] | TON_IoT | 2020 | 103 | The TON_IoT dataset was constituted by IoT traffic collected from a medium-scale network at the Cyber Range and IoT Labs of UNSW Canberra, Australia. Other types of IoT data include operating system logs and telemetry data. |
| | [111] | LITNET-2020 | 2020 | 44 | Feature vectors produced during 12 attacks on common computers installed on an academic network are included in the LITNET-2020 dataset. |
| | [112] | ADFA-LD | 2013 | 281 | ADFA-LD represents a worthy successor to the KDD collection. The most recent publicly accessible exploits and techniques are used with a contemporary Linux operating system for this dataset. |
| | [113] | UNSW-NB15 | 2015 | 1419 | This dataset contains two label attributes: the first specifies the attack, while the second is binary. It also has 49 features and covers attacks such as worms, backdoors, shellcode, DoS, generic attacks, exploits, and analysis attacks. |
| Botnet | [114] | CTU-13 | 2014 | 606 | Raw pcap files for malicious, normal, and background data are included in the CTU-13 dataset. The unidentified traffic comes from a sizable network and the botnet attacks are real, meaning that it is not a simulated dataset. |
| | [108] | CIC-IDS2017 | 2018 | 1672 | The CIC-IDS2017 dataset includes a variety of user profiles (creating background traffic) and multistage attacks such as Heartbleed and DDoS. Eighty traffic features were extracted using the CICFlowMeter program. |
| | [115] | ISOT Botnet Dataset | 2011 | 325 | The ISOT HTTP botnet dataset consists of two traffic captures: malicious DNS data for nine different botnets and benign DNS data for 19 different well-known software programs. The ISOT dataset is the combination of several existing publicly available malicious and non-malicious datasets. |
| | [116] | BOT-IOT Dataset | 2019 | 526 | The BOT-IOT dataset is made up of three parts: network platforms, fictitious IoT services, and feature extraction and forensic analytics. |
| | [98] | Genome Project | 2012 | 2689 | More than 1,200 malware samples covering the majority of the then-current Android malware families were collected in this dataset and systematically characterized from various aspects. |
| DGA | [117] | UMUDGA | 2020 | 25 | A comprehensive, labeled dataset with over 30 million AGDs arranged into 50 groups of malware variants, ready for ML. |
| | [118] | AmritaDGA | 2019 | 16 | AmritaDGA is made up of two data sets: the first is gathered from openly accessible sources, and the second from a private real-time network. |
| Phishing | [104] | ISCX-URL2016 | 2016 | 100 | Around 114,400 URLs were collected initially in this dataset, containing benign and malicious URLs in four categories: Spam, Malware, Phishing, and Defacement. |
| CPSs | [119] | HAI Dataset 1.0 | 2020 | 25 | The HAI dataset was collected from a realistic industrial control system (ICS) testbed augmented with a Hardware-In-the-Loop (HIL) simulator that emulates steam-turbine power generation and pumped-storage hydropower generation. |
| | [120] | Power System Attack Datasets | 2014 | 248 | This collection consists of three datasets that measure the normal, disturbed, controlled, and cyberattack behaviors of the electric transmission system. It contains measurements from relays, a simulated control panel, synchrophasor measurements, and data logs from Snort. |
| DoS | [121] | InSDN Dataset | 2020 | 50 | A variety of attack types, including DoS, DDoS, Web, Password-Guessing, and Botnets, are included in the InSDN dataset. |
| | [106] | UNB ISCX 2012 | 2012 | 1027 | The Canadian Institute for Cybersecurity at the University of New Brunswick (UNB) established UNB ISCX 2012 in 2012. Over seven days, traffic was recorded in a simulated network environment. |
TABLE 5. Some publicly available datasets in the context of distinct industry verticals.

| Industry Vertical | Reference | Dataset Name | Year | Cited Number | Dataset Details |
| Healthcare | [122] | PPMI | 2011 | 1059 | The PPMI dataset includes 200 healthy volunteers and 400 recently diagnosed PD patients followed longitudinally for clinical, imaging, and biospecimen biomarker assessment at 21 clinical sites using standardized data gathering techniques. |
| | [123] | CoAID | 2020 | 133 | This dataset includes bogus news on websites and social media platforms, as well as consumers' social engagement with such material. CoAID (Covid-19 heAlthcare mIsinformation Dataset) features a variety of COVID-19 healthcare misinformation: 4,251 news items, 296,000 user interactions, 926 social media posts regarding COVID-19, and ground truth labels. |
| | [124] | Heart Disease Cleveland UCI | 2020 | 27 | The Heart Disease Cleveland UC Irvine dataset uses 13 factors to predict whether or not a person has heart disease. Preprocessing was done using the 76-feature original dataset. |
| | [125] | MIMIC-III | 2016 | 4140 | MIMIC-III ('Medical Information Mart for Intensive Care') is a sizable, single-center database that contains data on people admitted to the critical care units of a tertiary care hospital. |
| | [126] | MIMIC-II | 2011 | 1104 | There were 25,328 intensive care unit stays in the MIMIC-II database. Laboratory data, therapeutic intervention profiles such as vasoactive medication drip rates and ventilator settings, nursing progress notes, discharge summaries, radiology reports, and provider order entry data were all collected during the detailed examination of intensive care unit patient stays. |
| | [127] | PTB-XL | 2020 | 171 | This 10-second-long 12-lead ECG-waveform dataset has 21,837 records from 18,885 patients. Up to two cardiologists annotated the ECG waveform data as a multi-label dataset, with diagnostic labels further grouped into superclasses and subclasses. |
| | [128] | BreakHis | 2016 | 725 | BreakHis is composed of 9,109 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X). To date, it contains 2,480 benign and 5,429 malignant samples. |
| | [129] | CPSC2018 dataset | 2018 | 204 | One normal ECG type and eight abnormal ECG types are part of the data utilized in CPSC2018. The study describes the data source, recording details, and clinical baseline characteristics of patients such as age and gender, as well as the typical procedures for detecting and categorizing the aberrant ECG patterns mentioned above. |
| | [130] | REMBRANDT | 2018 | 90 | The 671 cases in the REMBRANDT brain cancer dataset were gathered from 14 collaborating institutions between 2004 and 2006. It is available through the Georgetown Database of Cancer (G-DOC) open-access platform for undertaking clinical translational research. |
| | [131] | GlioVis | 2016 | 446 | GlioVis contains over 6,500 tumor samples from approximately 50 expression datasets of a large collection of brain tumor entities (mostly gliomas), both adult and pediatric. |
| Smart Transportation | [132] | Cologne vehicular mobility trace | 2013 | 327 | About 700,000 individual car trips are included in the resulting synthetic trace of car traffic in the city of Cologne, which spans a 400-square-kilometer area over the course of a normal working day. |
| | [133] | PKLot | 2015 | 227 | 695,899 photos from two parking lots were collected for this parking lot dataset using three different camera perspectives. The acquisition methodology enables the collection of static photographs illustrating variations in illumination on sunny, cloudy, and wet days. |
| | [134] | PEMS-SF Data Set | 2011 | 362 | This dataset describes the occupancy rate, ranging from 0 to 1, of the various car lanes on the motorways of the San Francisco Bay area. Samples are taken from the measurements every ten minutes, spanning the period from January 1st, 2008 to March 30th, 2009. |
| | [135] | CNRPark+EXT | 2017 | 282 | The CNRPark+EXT dataset, created on a parking lot with 164 spaces, has around 150,000 annotated pictures (patches) of vacant and occupied parking places. |
| | [136] | VED | 2020 | 24 | This open dataset records the GPS positions of moving vehicles combined with time-series data on their consumption of fuel, energy, speed, and auxiliary power. Between November 2017 and November 2018, a diversified fleet of 264 gasoline vehicles, 92 HEVs, and 27 PHEV/EVs was on the road. The data were gathered using onboard OBD-II recorders. The driving situations and seasons range from highways to congested city areas. |
| Smart Cities | [137] | T-Drive | 2011 | 826 | The dataset tracks 10,357 taxis in Beijing over the course of one week, from February 2 to February 8, 2008. Using longitude and latitude, it displays the location of a cab continuously throughout a range of time periods. |
| | [138] | GeoLife GPS Trajectories | 2009 | 2328 | The dataset captures trajectory positions tracking 182 mobile users in Beijing, China, from April 2007 to October 2011. Over 48,000 hours and nearly 1.2 million kilometers are covered throughout the complete journeys. |
| | [139] | KITTI | 2013 | 5831 | A cutting-edge dataset obtained from a Volkswagen station wagon for use in studies on mobile robotics and autonomous driving. A range of sensor modalities, including high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner, and a high-precision GPS/IMU inertial navigation system, was used to record 6 hours' worth of traffic scenarios at 10-100 Hz. |
| Smart Agriculture | [140] | Images on plant health | 2015 | 550 | Through the web platform PlantVillage, this dataset made available over 50,000 highly curated photos of healthy and diseased leaves of crop plants. |
| | [141] | PS-Plant | 2019 | 36 | Presented PS-Plant, a low-cost and portable 3D plant phenotyping platform based on photometric stereo (PS), an imaging technique novel to plant phenotyping. |
| | [142] | Plant Pathology | 2020 | 14 | 3,651 high-quality, realistic photos showing the symptoms of various apple foliar diseases were recorded in this collection, together with variations in noise, illumination, angle, and surface. The Kaggle community was given access to an expertly annotated subset as a prototype dataset for apple scab, cedar apple rust, and healthy leaves. |
| HCI | [143] | Clarkson | 2015 | 73 | This dataset offered a brand-new keystroke dataset that includes 39 users' transcribed text, free text, and short sentences. It can be used to recreate the authentication performance seen in earlier studies. However, all participants were required to complete the same set of predetermined activities in a university lab using the same HTML form and desktop computer. |
| | [144] | Torino | 2005 | 607 | Although the Torino dataset is similarly gathered using a predefined HTML form, participants are free to use any keyboard and complete their tasks at home rather than in a lab. |
| | [145] | Buffalo | 2016 | 51 | This dataset includes unprocessed keystroke data from 157 participants who were permitted to freely transcribe fixed text and respond to questions. The dataset is designed to capture temporal changes in typing habits as well as the disruptions brought on by various keyboard layouts. |
| Smart Financial System | [146] | Nielsen Dataset | 2017 | 32 | This information was gathered between 2006 and 2010 at 35,000 participating mass merchandisers, pharmacies, and grocery stores spread over 55 MSAs (metropolitan statistical areas) in the United States. |
| | [147] | Statlog (German Credit Data) Data Set | 1994 | N/A | The German Credit Data provides information on 20 criteria and the classification of 1,000 loan applicants as either good or bad credit risks. It also comes with a cost matrix. |
FIGURE 8 The overview of some common types of cyber attacks.

A. XAI APPLICATIONS IN DEFENDING AGAINST CYBER ATTACKS
XAI is playing an increasingly significant role in fighting a wide range of cyber attacks, as shown in Figure 8. In this subsection, we analyze state-of-the-art XAI-based defense systems for different categories of cyber attacks. Their combinations with XAI techniques are shown in Table 6 as well.
1) MALWARE
One of the major cyber security risks on the Internet today is malware, and implementing effective defensive measures necessitates the quick analysis of an ever-growing volume of malware [148]. Existing techniques for malware detection can be categorized into two main types: static detection and dynamic detection [149]. Static malware detection analyzes the malware binary without actually running the code; instead, a decompilation tool is utilized to obtain the decompiled code, and the included instructions are inspected. However, this kind of strategy can easily be countered by evasion methods such as obfuscation and the incorporation of syntax flaws. Dynamic malware detection, on the other hand, entails executing the malware code on a testing system and monitoring how it behaves.
In practice, using these conventional malware detection techniques and manually analyzing every malware file in an application takes a lot of time and resources. Therefore, many AI-based malware detection systems, especially DL algorithms, are utilized to detect malware with better performance and fewer resources than traditional malware detection methods [150]. However, a neural network works much like a black box, and its topology offers no indication of how it operates [151]. For similar motivations, many researchers deploy different categories of XAI approaches, to different degrees, to make AI-based malware detection systems more explainable and transparent, so that a reliable malware detector can continue to perform well when deployed to a new environment.
There are multiple ways to explain a malware detector. Identifying the most significant local features can always provide valuable explanations for malware detection decisions. Marco et al. [152] implemented a gradient-based approach to identify the most influential features contributing to each decision. A popular Android malware detector named Drebin [153] extracts information from Android applications. The explainability of Drebin on non-linear algorithms, including Support Vector Machines (SVMs) and Random Forests (RFs), is retained by both local and global explanations. The top 10 most important features, sorted by their relevance values, are disclosed for 3 different cases, while the AUC remains above 0.96.
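A gradient-based attribution of this kind can be sketched with a hand-rolled logistic regression: the gradient of the malware score with respect to each input feature ranks the features most influential for that particular decision. The model, weights, and feature names below are hypothetical and only illustrate the general technique, not any surveyed system.

```python
import math

# Hypothetical linear malware scorer over three binary app features.
features = ["requests_sms_permission", "uses_crypto_api", "has_launcher_icon"]
weights = [2.0, 0.5, -1.5]   # "learned" weights, made up for illustration
bias = -0.5

def malware_score(x):
    # Logistic regression probability that the app is malware.
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def gradient_attribution(x):
    """Gradient of the sigmoid score w.r.t. each input feature:
    d score / d x_i = score * (1 - score) * w_i."""
    s = malware_score(x)
    return [s * (1.0 - s) * w for w in weights]

sample = [1, 1, 1]                      # one app's feature vector
attrib = gradient_attribution(sample)
ranked = sorted(zip(features, attrib), key=lambda t: -abs(t[1]))
print([name for name, _ in ranked])
```

For a deep network the same recipe applies, with the gradient obtained by backpropagation instead of the closed form used here; the ranking of feature attributions is what the analyst reads as the local explanation.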
For neural network-based detection mechanisms, Shamik et al. [154] proposed a framework explaining how a deep neural network generalizes a real-world testing set at different layers. The gradients and weights of different layers of the MalConv architecture [155] and emberMalConv [156] are analyzed to identify different parts' contributions to the classification. High gradient values were found in the headers of the files, with peaks elsewhere, demonstrating that these parts are mostly responsible for the classification results. Besides, two filters A and B learned two different sets of features; an accuracy and F1-score of 91.2% and 90.7%, respectively, could still be achieved when model B was replaced by model A.
Hamad et al. [157] developed a pre-trained Inception-v3 CNN-based transfer-learned model to analyze malware in IoT devices. To better understand the features learned by the CNN models, Gradient-weighted Class Activation Mapping (Grad-CAM) is utilized to generate cumulative heatmaps and explain the models visually. Besides, t-distributed stochastic neighbor embedding (t-SNE) is used to verify the density of the features in the proposed CNN models. With the suggested methods, detection accuracies of 98.5% and 96.9% were achieved on the available testing dataset with a SoftMax classifier and an RF classifier, respectively.
Anli et al. [158] suggested a technique for extracting rules from a deep neural network so that the rules can be used to identify mobile malware behaviors. To represent the rules discovered between the inputs and outputs of each hidden layer in the deep neural network, an input-hidden tree and a single hidden-output tree were established for each hidden layer. The hidden-output tree can then identify the most important hidden layer, which specifies the related input-hidden tree. The experimental results showed that the accuracy, precision, recall, and F-measure of the proposed method were 98.55%, 97.93%, 98.27%, and 98.04%, respectively.
Giacomo et al. [159] offered a way of assessing DL models for malware classification using image data. It uses data from Grad-CAM and makes an effort to extend the evaluation of the training phase of the models being studied and provide visual information to security analysts. Besides, this technique extends the use of Grad-CAM and, in addition to the cumulative heatmap, automates the analysis of the heatmaps, assisting security analysts in debugging the model without any prior knowledge of the issue/pattern in question. Over a testing dataset of more than 8,000 samples classified into 7 families, the proposed model had a test accuracy of 97%.
In another approach, certain values are distributed to each feature set for every decision, showing the contribution of different sets of features to the detection results. The detection rates of the TCP flow and HTTP models reach 98.16% and 99.65%, while the false positive rates are 5.14% and 1.84%.
An explainable, fast, and accurate approach for detecting Android malware called PAIRED was presented by Mohammed et al. in [161]. The proposed detection system is lightweight, reducing the number of features by 84% and deploying classifiers that are not resource-intensive. 35 static features were extracted and later explained with SHAP methods. In the experiments, the PAIRED malware detection system was able to retain a very high accuracy of 97.98% while processing data in just 0.8206 µs, tested on the CICMalDroid2020 dataset with the 35 extracted features.
Martin et al. [162] presented a novel way to find locations in an Android app's opcode sequence that the CNN model considered crucial and that might help with malware detection. The CNN was demonstrated to assign high priority to locations similar to those highlighted by LIME, the state of the art for highlighting feature relevance, on the benchmark Drebin [101] dataset. Satisfying experimental results were produced as well, including accuracy = 0.98, precision = 0.98, recall = 0.98, and F1-score = 0.97.
2) SPAM
Due to the increasing number of Internet users, spam has become a major problem for them in recent years [163]. According to [164], over 306.4 billion emails were sent and received per day in 2021, and spam accounted for more than 55 percent of all emails sent that year, meaning that unsolicited messages made up more than half of all email traffic.
Recently, AI-based systems have come to be regarded as an efficient option for tackling the spam issue, primarily because of their ability to evolve and tune themselves [165]. However, due to the privacy and legal particularities of spam, users may ask many questions about the AI models, especially black-box ML and DL models [166]. For instance, a curious spam recipient interested in understanding the utilized AI models may ask the following questions:
1) Why is Message classified as spam by Model?
2) What distinguishes spam from no spam?
3) How does Model distinguish spam from no spam?
4) How does Model work distinguishing an alternative
However, the limitation of this approach is the morphed spam filter Model′ used in the past?
version of the malicious sample belonging to the family can 5) How does Model work?
evade antimalware detection. These proposed questions can be answered by the
TrafficAV, an effective and explainable detection implementation of XAI algorithms and XAI algorithms
framework of mobile malware behavior using network traffic can be used to complement ML models with desired
was proposed by Shanshan et al. [160]. This framework properties, such as explainability and transparency [167].
provided explainability to users by defining four sets for each And many works of literature have studied this area to
feature extracted from the malware HTTP request and every enhance the trust of the AI-based spam filters.
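As an illustration of how even a simple, intrinsically interpretable model can answer the first question above, the sketch below trains a multinomial Naive Bayes filter and reports the per-token log-likelihood ratios that drove a single decision. The corpus and tokens are invented for illustration; this is not taken from any work surveyed here.

```python
import math
from collections import Counter

def train_nb(spam_docs, ham_docs, alpha=1.0):
    """Fit per-token log-likelihood ratios log P(t|spam) - log P(t|ham)
    with Laplace smoothing; these ratios double as an explanation."""
    spam, ham = Counter(), Counter()
    for d in spam_docs:
        spam.update(d.lower().split())
    for d in ham_docs:
        ham.update(d.lower().split())
    vocab = set(spam) | set(ham)
    n_s, n_h = sum(spam.values()), sum(ham.values())
    return {t: math.log((spam[t] + alpha) / (n_s + alpha * len(vocab)))
             - math.log((ham[t] + alpha) / (n_h + alpha * len(vocab)))
            for t in vocab}

def explain(llr, message):
    """Score a message and rank each known token's contribution."""
    contrib = [(t, llr[t]) for t in message.lower().split() if t in llr]
    score = sum(w for _, w in contrib)
    label = "spam" if score > 0 else "ham"
    return label, sorted(contrib, key=lambda x: -abs(x[1]))

llr = train_nb(
    spam_docs=["win free prize now", "free money click now", "claim your free prize"],
    ham_docs=["meeting agenda for monday", "lunch at noon", "project status update"],
)
label, contrib = explain(llr, "free prize for you")
# `contrib` lists each token with its signed weight, answering
# "Why is Message classified as spam by Model?" token by token.
```

The same per-token decomposition is what makes Naive Bayes self-explainable in the sense used by the table at the end of this section, in contrast to the post-hoc explainers (SHAP, LIME) applied to the black-box models discussed next.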
Julio et al. [168] conducted a highly exploratory investigation of fake spam news detection with ML algorithms, using a large and diverse set of features. The SHAP method was deployed to explain, for representative models of each cluster, why some items are classified as fake news whereas others are not. Novel features related to the source domain of the fake news were proposed and appeared five times more frequently in the detection models than the other features. Besides, only 2.2 percent of the models achieved a detection performance higher than 0.85 in terms of AUC, which highlights how difficult it is to identify bogus news.

The legally required trade-off between accuracy and explainability was discussed and demonstrated in the context of spam classification by Philipp et al. in [169] as well. A dataset of 5,574 SMS messages [170] was used to support the argument that selecting the appropriate model for the task at hand is just as important as making complex models understandable. In this work, under circumstances in which only a small quantity of annotated training data is available, very simple models, such as Naive Bayes, could outperform more complicated models, such as Random Forests.

HateXplain, a benchmark hate speech dataset that considers bias and explainability from many angles, was introduced by Binny et al. in [171]. Several models including CNN-GRU [172], BiRNN [173], and BiRNN-Attention [174] were tested on this dataset, and explainability-based metrics such as Intersection-Over-Union (IOU), comprehensiveness, and sufficiency were utilized to evaluate model interpretability. Experimental results showed that models that succeed at classification may not always be able to explain their conclusions in a way that is believable and accurate. The limitations of this benchmark dataset are that external contexts relevant to the classification task, such as the profile bio, user gender, and post history, were not considered, and that the proposed dataset contains English text only.

3) BOTNET
A botnet attack is known as a group of connected computers working together to carry out harmful and repetitive actions to corrupt and disrupt the resources of a victim, such as crashing websites [175]. As shown in Figure 9, a typical botnet's lifecycle contains 5 phases: Initial Injection, Secondary Injection, Connection, Malicious Activities, and Maintenance and Updating.

The market for global botnet detection is anticipated to expand from US$207.4 million in 2020 to US$965.6 million in 2027, at a compound annual growth rate (CAGR) of 24.0 percent from 2021 to 2027, according to [176]. Imperva Research Labs [177] also found that botnets constituted 57% of all attacks against e-commerce websites in 2021. These statistics indicate that developing AI-based systems for detecting botnets is necessary. Besides, XAI can contribute to trust in botnet detection systems and prevent automation bias when users place too much trust in the systems' output.

In [178], Hatma et al. proposed a novel model for botnet DGA detection. Five ML algorithms were utilized and tested with datasets of 55 botnet families. Random Forest achieved the best accuracy of 96.3% and outperformed previous works as well. Open-source intelligence (OSINT) and XAI techniques including SHAP and LIME were combined in this work to provide an antidote for skepticism toward the model's output and enhance trust in the system. Besides, the limitations of the proposed framework were the temporal complexity involved in calculating the characteristics and the model's low resistance to Mask botnet assaults.

Shohei et al. [179] presented a novel two-step clustering approach based on DBSCAN to cluster botnets and classify their categories. Important features were represented and explained by combining subspace clustering and frequent pattern mining on two different real-world flow datasets, MAWI [180] and ISP. 60 bot groups were categorized from 61,167 IP addresses in the MAWI dataset, whereas 295 bot groups were categorized from 408,118 IP addresses in the ISP dataset. The cluster results of the botnets were self-explained by using a dendrogram.

Visualization tools are also used to give better explanations of the reasons for labeling an account as botnet or legitimate. Michele et al. [181] suggested ReTweet-Tweet (RTT), a small but informative scatterplot representation that makes it simpler to explore a user's retweeting activities. While the proposed botnet detection method Retweet-Buster (RTbust), based on variational autoencoder (VAE) and long short-term memory (LSTM) unsupervised feature extraction approaches, is black-box in nature, the visualization tool RTT can still be employed economically after RTbust has been applied, in order to comprehend the traits of the accounts that have been classified as bots.

Some researchers suggested the necessity of reducing the number of required features for botnet classification to overcome scalability and computation resource problems and to provide more reliable explanations in botnet detection systems. In [182], Hayretdin et al. utilized Principal Component Analysis (PCA) for feature dimension reduction, while a Decision Tree classifier preserved the original features and clearly illustrated how the classifier determined the labels. Therefore, a cyber security analyst can quickly comprehend an attack or typical behavior and utilize this understanding to further interpret a security event or incident.

With the rise of DL, several pilot studies have been created to understand the behavior of botnet traffic. However, it is difficult for users to understand and put their trust in the outcomes of present DL models because of neural networks' opaque decision-making and lack of transparency compared to other approaches. To address this issue, Partha et al. [183] carried out in-depth tests using both synthetic and
actual network traffic produced by the IXIA BreakingPoint System, and the results showed that the proposed DCNN botnet detection models outperformed the existing ML models with an improvement of up to 15% on all performance metrics, while SHAP was deployed to provide a clear explanation of the model decisions and gain the trust of the end users.

FIGURE 9. The typical lifecycle of a botnet.

BotStop, a packet-based and ML-based botnet detection solution aimed at examining the incoming and outgoing network traffic of an IoT device to stop botnet infections, was introduced by Mohammed in [184]. The suggested method additionally emphasized feature selection, utilizing only seven features to train an extremely accurate ML classifier. The trained classifier surpassed all methods from similar work with an accuracy of 0.9976, an F1-Score of 0.9968, and a testing duration of 0.2250 µs. Besides, very low FN and FP rates of 0.21 percent and 0.31 percent were attained with the suggested approach as well. A SHAP explanation is used to make the classifier's prediction process transparent.

4) FRAUD
According to [185], during the tightest periods of lockdown in the Covid-19 epidemic, there were observed rises in personal account hacking and online financial fraud. In the UK, fraud costs businesses and individuals £130 billion per year, while it costs the worldwide economy $3.89 trillion [186]. Therefore, to deal with this issue, numerous financial services have the potential to benefit from the use of AI systems to defend against fraud attacks. However, there are still practical challenges with the complete implementation of AI methods, and some of them concern comprehending and explaining the judgments and predictions produced by complicated models, which is the focus of XAI [187].

Ismini et al. [187] investigated explanations for fraud detection by both supervised and unsupervised models using two of the most used techniques, LIME and SHAP. The open-source IEEE-CIS Fraud Detection dataset [188] was tested on 8 popular supervised and unsupervised AI models including Naive Bayes, Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Neural Network, Autoencoder, and Isolation Forest, with LIME and SHAP providing explanations for the detection results of each model. It was noticed that while SHAP gives more reliable explanations, LIME is faster. Therefore, this paper suggested that combining the two approaches may be advantageous, with SHAP being used to facilitate regulatory compliance and LIME being used to offer real-time explanations for fraud prevention and model accuracy analysis.

David et al. [189] investigated how existing XAI algorithms may be used to explain specific predictions for prescriptive solutions and to derive more information about the causes of cyber-fraud in the iGaming industry. ML algorithms including RF, LGB, DT, and LR were utilized to analyze a dataset with a sample size of 197,733. Besides, this study also proved the existence of data drift and suggested monthly retraining for the model to remain consistently updated. Furthermore, to identify the features that contributed most significantly to a particular case and to quantify that contribution, this study employed locally faithful explanations. These explanations take the form of mathematical inequalities that reflect feature conditions, and each condition is assigned a relative strength. One of the research's limitations would be the manually labeled dataset, which could have added bias and human error to the analysis.

XFraud, an explainable fraud transaction prediction framework composed of a detector and an explainer, was presented by Susie et al. in [190]. A heterogeneous GNN model for transaction fraud detection was proposed and tested on industrial-scale datasets. Heterogeneity in transaction graphs was captured, and the presented methodology outperformed the previous models HGT [191] and GEM [192]. Besides, the weights learned by the GNNExplainer and the edge weights calculated using centrality measures were compared and traded off to compute a hybrid explainer in XFraud. The computed hybrid XFraud explainer calculated the contributions of the surrounding node types and edges and also paid attention to global topological aspects discovered by the centrality metrics.

XAI methods can also be utilized to improve the performance of fraud detection models. In [193], Khushnaseeb et al. proposed SHAP_Model, based on an autoencoder for network fraud detection using SHAP values, implemented it on a subset of the CICIDS2017 dataset, and achieved overall accuracy and AUC of 94% and 96.9% respectively. The top 30 features with the highest SHAP values, which play a more significant role in causing abnormal behavior in fraud detection than any other features, were employed to build the SHAP_Model. Experimental results demonstrated that the SHAP_Model outperformed both the model based on all features and the model based on 39 features extracted by unsupervised learning.
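The SHAP attributions used throughout the fraud studies above are grounded in Shapley values, which for a model with only a handful of features can be computed exactly by enumerating feature coalitions. The sketch below does this in pure Python; the three-feature fraud score is a toy stand-in invented for illustration, not a model from any of the cited works.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, baseline, instance):
    """Exact Shapley values: each feature's average marginal contribution
    over all coalitions; absent features are filled in from a baseline."""
    n = len(instance)

    def value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for s in combinations(others, k):
                # Weight |S|! (n - |S| - 1)! / n! from the Shapley formula.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi

# Toy fraud score: amount (x0) and night-time flag (x1) interact,
# country risk (x2) contributes linearly.
predict = lambda x: 2.0 * x[0] * x[1] + 1.0 * x[2]
baseline = [0.0, 0.0, 0.0]
instance = [1.0, 1.0, 1.0]
phi = shapley_values(predict, baseline, instance)
# Efficiency property: phi sums to predict(instance) - predict(baseline),
# and the 2.0 interaction term is split equally between x0 and x1.
```

Practical SHAP implementations approximate this exponential enumeration (e.g., via sampling or model-specific shortcuts), which is why the reliability-versus-speed trade-off against LIME noted by Ismini et al. [187] arises.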
Yongchun et al. [194] proposed a Hierarchical Explainable Network (HEN) to represent user behavior patterns, which could help with fraud detection while also making the inference process more understandable. Furthermore, a transfer framework was suggested for knowledge transfer from source domains with sufficient and mature data to the target domain to address the issue of cross-domain fraud detection.

A novel fraud detection algorithm called FraudMemory was proposed in [195] by Kunlin et al. This methodology used memory networks to enhance both performance and interpretability while using a novel sequential model to capture the sequential patterns of each transaction. Besides, memory components were incorporated in FraudMemory to provide high adaptability to the existence of concept drift. The precision and AUC of the FraudMemory model were 0.968 and 0.969 respectively, better than all the other methods used for comparison, including SVM, DNN, RF, and GRU.

Based on a real-world dataset and a simulated dataset, Zhiwen and Jianbin [196] proposed an explainable classification approach within the multiple instance learning (MIL) framework that deployed the AP clustering method in a self-training LSTM model to obtain a precise explanation. The experimental results indicated that the presented methodology surpassed the other 3 benchmark classifiers, including AP, SVM, and RF, on both datasets. The method used in this study is one of only a few classification methods that can produce a straightforward causal explanation.

Wei et al. [197] proposed a DL-based behavior representation framework for clustering to detect fraud in financial services, called FinDeepBehaviorCluster. A time-attention-based Bi-LSTM was used to learn the embedding of behavior sequence data, whereas handcrafted features were deployed to provide explanations. Then a GPU-optimized HDBSCAN algorithm called pHDBSCAN is used for clustering transactions with similar behaviors. The proposed pHDBSCAN demonstrated performance comparable to the original HDBSCAN in experiments on two real-world transaction data sets, but with hundreds of times greater computation efficiency.

5) PHISHING
Phishing refers to fake email messages that appear to be sent by a well-known company. The intention is to either download malicious software onto the victim's computer or steal sensitive data from it, including credit card numbers and login credentials. Phishing is a form of online fraud that is gaining popularity [198].

Yidong et al. [199] proposed a multi-modal hierarchical attention model (MMHAM) that, for phishing website detection, jointly learned deep fraud cues from the three main modalities of website content: URLs, text, and images. Features extracted from the different contents are aligned as representations in the attention layer. This methodology is self-explaining because the content receiving the most attention is regarded as the most important content contributing to the final decision.

Paulo et al. [200] utilized the LIME and EBM explanation techniques, based on malicious URLs, in a phishing experiment on the publicly available Ebbu2017 dataset [201]. The EBM, Random Forest, and SVM classifiers attained accuracies of 0.9646, 0.9732, and 0.9469 respectively on the tested database. The empirical evidence supported that the models could accurately categorize URLs as phishing or legitimate, and the explanation techniques also added explainability to these ML models, improving the final classification outcome.

Visual explanations of the phishing detection system attracted attention in the work of Yun et al. [202] as well. The proposed phishing website detection method Phishpedia solved the challenging issues of logo detection and brand recognition in phishing website detection. Both high accuracy and little runtime overhead are attained via Phishpedia. Most crucially, unlike conventional methods such as EMD, PhishZoo, and LogoSENSE, Phishpedia does not demand training on any specific phishing samples. Moreover, Phishpedia was deployed with the CertStream service and, in just 30 days, found 1,704 new genuine phishing websites, far more than other solutions; in addition, 1,133 of these were not flagged by any engine in VirusTotal.

Rohit et al. [203] proposed an anti-phishing method that utilizes persuasion cues and investigated their effectiveness. To address the research questions, three ML models were developed with pertinent gain persuasion cues, loss persuasion cues, and combined gain and loss persuasion cues, respectively, and their results were compared with a baseline model that does not take persuasion cues into account. The findings demonstrate that the three phishing detection models incorporating pertinent persuasion cues considerably outperform the baseline model in terms of F1-score by a range of 5% to 20%, making them effective tools for phishing email detection. In addition, the use of this theoretical perspective can aid in the creation of models that are comprehensible and can help in understanding black-box models.

6) NETWORK INTRUSION
An unauthorized infiltration into a computer in your company or an address in your designated domain is referred to as a network intrusion. Network Intrusion Detection Systems (NIDSs), on the other hand, are defined as systems monitoring network or local system activity for indications of unusual or malicious behavior that violates security or accepted practices [36]. Recently, many works have adopted ML and DL algorithms for building efficient NIDSs. In addition, cyber security experts also consider introducing explainability to the black-box AI systems to make the NIDSs more robust, and many have tried with XAI [204].

Pieter et al. [204] proposed a two-staged pipeline for robust network intrusion detection, which deployed XGBoost in the first phase and an autoencoder in the second phase. The SHAP method was implemented to explain the first-stage
model, whereas the explanation results were utilized in the second stage to train the autoencoder. Experiments on the public corpus NSL-KDD [105] showed that the proposed pipeline can outperform many state-of-the-art efforts in terms of accuracy, recall, and precision, with 93.28%, 97.81%, and 91.05% respectively on the NSL-KDD dataset, while adding an extra layer of explainability.

ROULETTE, an explainable network intrusion detection system for neural attention multi-output classification of network traffic data, was introduced by Giuseppina et al. in [205]. Experiments were performed on two benchmark datasets, NSL-KDD [105] and UNSW-NB15 [113], to demonstrate the effectiveness of the proposed neural model with attention. The additional attention layer enables users to observe the specific network traffic characteristics that are most useful for identifying particular intrusion categories. Two heatmaps depicting the ranked average feature relevance of the flow characteristics in the attention layer for the above two datasets were provided to show the explanation.

Zakaria et al. [206] designed a novel DL- and XAI-based system for intrusion detection in IoT networks. Three different explanation methods, LIME, SHAP, and RuleFit, were deployed to provide local and global explanations for the single output of the DNN model and for the most significant features leading to the intrusion detection decision. Experiments were run on the NSL-KDD [105] and UNSW-NB15 [113] datasets, and the performance results indicated the proposed framework's effectiveness in strengthening the IoT IDS's interpretability against well-known IoT assaults and in assisting cybersecurity professionals to better comprehend IDS judgments.

Yiwen et al. [207] presented an intrusion detection system aimed at detecting malicious traffic intrusions in networks, such as flood attacks and DDoS attacks. This method was XAI-based and deployed both neural networks and tree models. Notably, this approach decreased the number of convolution layers in the neural network to enhance the model's explainability without sacrificing the model's accuracy. XGBoost was implemented to process the prediction outputs of the neural network, and the processed results were fed to LIME and SHAP for further explanation.

A novel intrusion detection system known as BiLSTM-XAI was presented by S. Sivamohan et al. in [208]. The Krill Herd Optimization (KHO) algorithm was implemented to select the most significant features of two network intrusion datasets, NSL-KDD [105] and Honeypot [209], to reduce the complexity of the BiLSTM model and thus enhance detection accuracy and explainability. The obtained detection rates of 97.2% on Honeypot and 95.8% on the NSL-KDD dataset were superior, and LIME and SHAP were deployed to explain the detection decisions.

Hong et al. [210] suggested a network intrusion detection framework called FAIXID, making use of XAI and data cleaning techniques to enhance the explainability and understandability of intrusion detection alerts. The proposed framework will help cyber analysts make better decisions because false positives will be quickly eliminated. Five functional modules were identified in the FAIXID framework: the pre-modeling explainability module, the modeling module, the post-modeling explainability module, the attribution module, and the evaluation module. XAI algorithms including Exploratory Data Analysis (EDA), Boolean Rule Column Generation (BRCG), and the Contrastive Explanations Method (CEM) were deployed in the pre-modeling explainability module, the modeling module, and the post-modeling explainability module respectively, to provide cybersecurity analysts with comprehensive and high-quality explanations of the detection decisions made by the framework. On the other hand, collecting analysts' feedback through the evaluation module to enhance the explanation models by data cleaning also proved effective in this work.

Shraddha et al. [211] proposed a system in which the relations between features and the system outcome, instance-wise explanations, and local and global explanations help identify the features relevant to decision making, helping users comprehend the patterns that the model has learned by looking at the generated explanations. If the learned patterns are incorrect, users can alter the dataset or choose a different set of features to ensure that the model learns the correct patterns. XAI methods including SHAP, LIME, the Contrastive Explanations Method (CEM), ProtoDash, and Boolean Decision Rules via Column Generation (BRCG) were implemented at different stages of the framework so that the neural network is not a black box. The experiment was performed on the NSL-KDD dataset [105], and the proposed framework was applied to generate explanations from different perspectives.

The Decision Tree algorithm was utilized by Basim et al. in [212] to enhance trust management and was compared with other ML algorithms such as SVM. By applying the Decision Tree model to the benchmark network intrusion dataset NSL-KDD [105], three tasks were performed: ranking the features, extracting decision tree rules, and comparing with the state-of-the-art algorithms. The ranking of the network features was listed, and it is noticeable that not all features contributed to the intrusion decision. Besides, the advantages of the Decision Tree algorithm over other popular classifiers, namely being computationally cheaper and easy to explain, were also demonstrated in this work.

Syed et al. [213] suggested an Intrusion Detection System that used the global explanations created by a joint SHAP and Random Forest framework to detect all forms of malicious intrusion in network traffic. The suggested framework was composed of two stages of Random Forest classifiers and one SHAP stage. SHAP provided explanations for the outcome of the initial Random Forest classifier, and any decision of the first Random Forest classifier with low credibility would be reassessed by the secondary classifier.
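The reassessment step just described can be sketched as a simple confidence gate. The classifiers and threshold below are hypothetical stand-ins for illustration only, not the Random Forest stages of [213]:

```python
def two_stage_predict(primary, secondary, x, threshold=0.8):
    """Route low-credibility primary decisions to a secondary classifier.

    `primary` and `secondary` return (label, confidence) pairs; a primary
    decision below `threshold` is reassessed by the secondary stage."""
    label, confidence = primary(x)
    if confidence >= threshold:
        return label, confidence, "primary"
    return (*secondary(x), "secondary")

# Stand-in classifiers: a fast model that is unsure near its decision
# boundary, and a slower model consulted only when credibility is low.
fast = lambda x: ("attack", 0.95) if x > 10 else ("benign", 0.55)
slow = lambda x: ("attack", 0.90) if x > 5 else ("benign", 0.90)

assert two_stage_predict(fast, slow, 12) == ("attack", 0.95, "primary")
assert two_stage_predict(fast, slow, 7) == ("attack", 0.90, "secondary")
```

In the cited framework the gate's threshold corresponds to the credibility assessment informed by SHAP explanations, so only borderline traffic pays the cost of the extra stage.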
This three-stage architecture can increase user trust while filtering out cloaked dangerous network data by introducing transparency into the decision-making process. The CSE-CIC IDS 2018 [214] dataset was utilized to evaluate the performance of the proposed framework, and the presented architecture produced accuracy rates of 98.5 percent and 100 percent on the test dataset and on adversarial samples, respectively.

Tahmina et al. [215] proposed an XAI-based ML system to detect malicious DoH traffic within the DNS over HTTPS protocol. The publicly available CIRA-CIC-DoHBrw-2020 dataset [216] was utilized in testing the proposed Balanced and Stacked Random Forest framework against other ML algorithms, including Gradient Boosting and a generic Random Forest. The suggested approach obtained slightly greater precision (99.91 percent), recall (99.92 percent), and F1 score (99.91 percent) than the methods used for comparison. Additionally, feature contributions to the detection results were highlighted with the help of the SHAP algorithm. A limitation of this framework would be that it does not separate DGA-related DoH traffic from other HTTPS traffic.

7) DOMAIN GENERATION ALGORITHMS (DGA)
DGAs are a malware technique frequently used to generate a huge number of domain names that can be utilized for evasive communication with Command and Control (C2) servers. It is challenging to prohibit harmful domains using common approaches like blacklisting or sink-holing due to the abundance of unique domain names. A DGA's dynamics are widely driven by a seeded function. Deterring a DGA strategy presents a hurdle because an administrator would need to recognize the virus, the DGA, and the seed value to filter out earlier dangerous networks and subsequent servers in the sequence. The DGA makes it more challenging to stop unwanted communications because a skilled threat actor can sporadically switch the server or location from which the malware automatically calls back to the C2 [217]. Therefore, blacklisting and other conventional malware management techniques fall short in combating DGA attacks, and many ML classifiers have been suggested instead. These classifiers allow for the identification of the DGA responsible for the creation of a given domain name and, consequently, for starting targeted remedial actions. However, it is challenging to assess their inner logic due to their black-box nature, and the consequent lack of confidence makes it difficult to deploy such models.

Franziska et al. [218] proposed a visual analytics framework that offers DL model creators clear interpretations of the models created for the classification of DGAs. The activations of the model's nodes were clustered, and decision trees were utilized to illuminate these clusters. Users can examine how the model sees the data at different layers in conjunction with a 2D projection. A drawback of the proposed strategy is that although the decision trees can provide a possible explanation for the clusters, this does not necessarily reflect how the model classifies the data, especially when there are numerous equally valid explanations.

EXPLAIN, a feature-based and contextless DGA multiclass classification framework, was introduced by Arthur et al. in [219] and compared with several state-of-the-art classifiers such as RNN, CNN, SVM, RF, and ResNet on real-world datasets including DGArchive [220] and a university network [221]. After the ResNet-based techniques, the best model, EXPLAIN-OvRUnion, used 76 features and achieved the best F1-score. Moreover, only 28 features were used by EXPLAIN-OvRRFE-PI and EXPLAIN-RFRFE-PI, which outperformed all feature-based strategies put forward in previous work by a significant margin. Additionally, they outperformed the DL-based algorithms M-Endgame, M-Endgame.MI, and M-NYU in terms of F1-score as well.

To address open issues of DGA classification, including which traffic should be trained in which network and when, and how to measure resilience against adversarial assaults, Arthur et al. [222] proposed two ResNet-based DGA detection classifiers, one for binary classification and the other for multiclass classification. Experiments on real-world datasets demonstrated that the proposed classifier performed at least comparably to the best state-of-the-art algorithms on the binary classification test with a very low false positive rate, and significantly outperformed the competition in the extraction of complex features. In addition, for the multiclass classification problem, the ResNet-based classifier performed better than previous work in attributing AGDs to DGAs, achieving an improvement of nearly 5 percent in F1-score while requiring 30 percent less training time than the next best classifier. In the explainability analysis, some of the self-learned properties employed by the DL-based systems were also highlighted.

8) DENIAL-OF-SERVICE (DOS)
The Internet is seriously threatened by denial-of-service (DoS) assaults, and numerous protection measures have been suggested to address the issue. DoS attacks are ongoing attacks in which malicious nodes produce bogus messages to obstruct network traffic or drain the resources of other nodes [223]. As DoS attacks have become increasingly complicated in recent years, conventional Intrusion Detection Systems (IDS) find it increasingly challenging to identify these newer, more sophisticated attacks because they use more complicated patterns. To identify malicious DoS assaults, numerous ML and DL models have been deployed. Additionally, for the goal of model transparency, XAI methods that investigate how features contribute to or impact an algorithm-based choice can be helpful [224].
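One common model-agnostic way to measure such feature contributions is permutation importance: shuffle one feature column at a time and record how much the model's accuracy drops. The sketch below uses a toy DoS-style dataset and rule invented for illustration; it is not drawn from any of the cited works.

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    accuracy = lambda rows: sum(predict(r) == t for r, t in zip(rows, y)) / len(y)
    base = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the feature-label association for column j
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy DoS-style data: the label depends only on packet rate (feature 0);
# feature 1 is noise, so its importance should be exactly zero here.
X = [[i, i % 3] for i in range(40)]
y = [1 if row[0] >= 20 else 0 for row in X]
predict = lambda row: 1 if row[0] >= 20 else 0
imp = permutation_importance(predict, X, y)
```

Because it only needs predictions, the same procedure applies unchanged to any of the black-box DoS detectors discussed in this subsection, at the cost of re-scoring the dataset once per feature per repeat.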
TABLE 6. Details of XAI applications in defending mechanisms against different categories of cyber attacks.

Cyber Learning models XAI techniques


attack types Reference Year Local Global Model- Model- Post-hoc Intrinsic Text Visual Arguments Models XAI
specific agnostic methods
[150] SVM and RF 2018 √ √ √ √ √ gradient
[154] DNN 2020 √ √ √ √ √ heatmap
[157] CNN 2020 √ √ √ √ √ √ Grad-CAM
[158] DNN 2021 √ √ √ √ Generated
trees
Malware [159] CNN 2021 √ √ √ √ Grad-CAM
, heatmap
[160] DT 2016 √ √ √ √ Self
explainable
[161] RF, LR, DT, 2022 √ √ √ √ √ SHAP
GNB, and SVM
[162] CNN 2021 √ √ √ √ √ LIME
[168] XGBoost 2019 √ √ √ √ √ SHAP
Spam [169] NB and RF 2020 √ √ Self
explainable
[171] RNN and CNN 2021 √ √ √ √ √ LIME
[178] RF, NB, and LR 2022 √ √ √ √ √ LIME and
SHAP
[179] DBSCAN 2019 √ √ √ √ √ Self
explainable
[181] VAEs and LSTM 2019 √ √ √ Visualized
tools
Botnet [182] DT 2018 √ √ √ Self
explainable
[183] DCNN 2022 √ √ √ √ √ SHAP
[184] ML 2022 √ √ √ √ √ SHAP
[187] Autoencoder, 2021 √ √ √ √ √ LIME and
NB, RF and DT SHAP
[189] RF, LGB, DT, 2021 √ √ √ √ Local
and LR features
[190] GNN 2022 √ √ √ √ √ GNN
Explainer
[193] Autoencoder 2021 √ √ √ √ √ Kernel
SHAP
Fraud
[194] Transfer Learning 2020 √ √ √ √ HEN
[195] Sequential 2019 √ √ √ Fraud
modeling Memory
[196] AP Clustering 2021 √ √ √ MIL
and LSTM
[197] Bi-LSTM and 2021 √ √ √ √ Feature
pHDBSCAN extraction
[199] MMHAM 2022 √ √ √ √ √ √ Self
explainable
[200] RF and SVM 2021 √ √ √ √ √ LIME and
EBM
Phishing [202] Phishpedia 2021 √ √ √ Visual
explanation
[203] NB, LR, RF, and 2021 √ √ √ Theoretical
SVM Perspective
[204] XGBoost and 2022 √ √ √ √ √ SHAP
autoencoder
[205] Neural network 2022 √ √ √ √ Self
and attention explainable
[206] DNN 2022 √ √ √ √ √ √ LIME,
SHAP, and
RuleFit
[207] CNN, LSTM, and 2022 √ √ √ √ √ LIME and
XGBoost SHAP
[208] BiLSTM 2022 √ √ √ √ √ √ KHO,
LIME, and
Network SHAP
Intrusion [210] DNN 2021 √ √ √ √ √ √ √ EDA,
BRCG, and
CEM
[211] DNN 2021 √ √ √ √ √ √ √ SHAP,
LIME, and
BRCG
[212] DT 2021 √ √ √ √ √ Self
explainable
[213] RF 2021 √ √ √ √ √ SHAP
[215] Stacked RF 2022 √ √ √ √ √ SHAP
[219] CNN and RNN 2020 √ √ √ √ √ Clustering
and DT
Domain [220] RNN, CNN, 2021 √ √ √ √ EXPLAIN
Generation SVM, RF, and
Algorithms ResNet
(DGA) [222] ResNet 2020 √ √ √ √ Self
explainable
[225] XGBoost 2022 √ √ √ √ √ SHAP
Denial-of- [226] ML 2021 √ √ √ √ TCAV
Service [228] DNN 2018 √ √ √ √ √ √ DNN
(DoS) Explanation
Generator
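Several of the XAI techniques recurring in Table 6, LIME most prominently, share a single local-surrogate idea: sample perturbations around one input, query the black-box model, and fit a simple proximity-weighted linear model whose coefficients serve as the explanation. The sketch below is a stdlib-only illustration of that idea, not the actual LIME implementation; the black-box rule and all constants are invented for the example, and it fits one weighted least-squares slope per feature rather than LIME's joint linear fit.

```python
import math
import random

# Hypothetical black-box classifier to be explained locally: it flags an
# event when a weighted feature sum (weights unknown to the explainer)
# crosses a threshold. Feature 0 carries twice the weight of feature 1.
def black_box(z):
    return 1.0 if 2.0 * z[0] + z[1] > 1.0 else 0.0

def local_surrogate(f, x, n=500, seed=0):
    """LIME-style sketch: sample near x, weight samples by proximity,
    and estimate one weighted least-squares slope per feature."""
    rng = random.Random(seed)
    samples = [[xi + rng.uniform(-0.5, 0.5) for xi in x] for _ in range(n)]
    labels = [f(z) for z in samples]
    # Proximity kernel: closer perturbations get larger weights.
    weights = [math.exp(-sum((zi - xi) ** 2 for zi, xi in zip(z, x)))
               for z in samples]
    w_sum = sum(weights)
    coefs = []
    for i in range(len(x)):
        mean_z = sum(w * z[i] for w, z in zip(weights, samples)) / w_sum
        mean_y = sum(w * l for w, l in zip(weights, labels)) / w_sum
        cov = sum(w * (z[i] - mean_z) * (l - mean_y)
                  for w, z, l in zip(weights, samples, labels)) / w_sum
        var = sum(w * (z[i] - mean_z) ** 2
                  for w, z in zip(weights, samples)) / w_sum
        coefs.append(cov / var)
    return coefs

coefs = local_surrogate(black_box, [0.5, 0.0])
# Feature 0 should receive roughly twice the local attribution of feature 1.
print(coefs)
```

The recovered coefficients mirror the hidden decision rule near the queried point, which is exactly the property the surveyed detection systems rely on when they report per-feature explanations for individual alerts.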
Boryau et al. [225] introduced CSTITool, a CICFlowMeter-based flow feature extraction tool, to enhance the performance of the ML DoS attack detection model. CICFlowMeter translated the flow data from packets for the model's training. The size of the data was significantly reduced during this process, which decreased the need for data storage. Hacker attack data including Network Service Scanning, Endpoint DoS, Brute Force, and Remote Access Software from the dataset CIC-IDS2017, and network flow data of malware from the dataset CSTI-10, were utilized to train the XGBoost model. The outcome demonstrated that the performance measurements can be enhanced by using the additional descriptive flow statistics produced by CSTITool. For instance, Rig's Precision and Recall increased by 1.23% and 1.59% respectively. Moreover, the XAI method SHAP was deployed to further explore the relationship between cyberattacks and network flow variables to better understand how the model produced predictions.

In the context of DoS attacks, Rendhir et al. [226] analyzed strategic decisions based on the KDD99 dataset [227] with the XAI method of Testing with Concept Activation Vectors (TCAV). The approach investigates the connection between the strategic choice, the autonomous agent's objective, and dataset properties. TCAVQ scores are obtained from the KDD99 dataset for various DoS attacks and regular traffic. The relationship between the availability goal and the strategies TerminateConnection and AllocateMoreResources is determined using the TCAVQ scores. In the event of cyberattacks, the analysis is performed to support the choice of the plan or, if necessary, a change in the strategy.

Kasun et al. [228] described a framework for explainable DNN-based DoS anomaly detection in process monitoring, in which the user is given post-hoc explanations for DNN predictions. Based on the DoS attack benchmark dataset NSL-KDD [105], experiments were implemented on several DNN architectures, and it was found that DNNs were able to yield accuracies of 97% on the test dataset. Besides, according to the experimental findings, when traffic was classified as DoS, the DNNs assigned higher relevance to the number of connections, the connection frequency, and the volume of data exchanged. Therefore, this framework improves human operators' confidence in the system by reducing the opaqueness of the DNN-based anomaly detector.

B. XAI FOR CYBER SECURITY IN INDUSTRIAL APPLICATIONS

In this subsection, we aim to present a comprehensive overview of XAI studies for the cyber security of different industrial areas, as shown in Figure 9. The details of these XAI implementations for cyber security in distinct industries are shown in Table 7 as well.

1) XAI FOR CYBER SECURITY OF HEALTHCARE

The use of big data, cloud computing, and IoT creates a modern, intelligent healthcare industry. The use of the Internet of Things, cutting-edge manufacturing technologies, software, hardware, robots, sensors, and other sophisticated information technologies improves data connectivity. Information and communication technology advancements enhance the quality of healthcare by transforming conventional healthcare organizations into smart healthcare [229]. With the increasingly significant role of AI in healthcare, there are growing concerns about the vulnerabilities of the smart healthcare system. Smart healthcare is a prime target for cybercrime for two main reasons: it holds a vast supply of valuable data, and its defenses are porous. Health information theft, ransomware attacks on hospitals, and potential attacks on implanted medical equipment are all examples of cyber security breaches. Breaches can undermine smart healthcare systems, erode patient trust, and endanger human life [230].

XAI comes into the picture as the smart healthcare system demands transparency and explainability to decrease its growing vulnerabilities, driven by increasingly connected mobile devices, greater concern for patient monitoring, and more mobile consumer devices. There are currently many studies on implementing the XAI framework to address the privacy and security issues of the smart healthcare system.

Devam et al. [231] introduced a study based on the heart disease dataset and illustrated why explainability techniques should be chosen when utilizing DL systems in the medical field. The study then suggested and described various example-based strategies, such as Anchors, Counterfactuals, Integrated Gradients, Contrastive Explanation Method, and Kernel Shapley, which are crucial for disclosing the nature of the model's black box and ensuring model accountability. These XAI approaches were compared with two benchmark XAI methods, LIME and SHAP, as well. It was concluded that the discussed XAI approaches all explained how different features contribute to the outputs of the model. They are intuitive, which helps in understanding what the black-box model thinks and in explaining the model's behavior.

BrainGNN, an explainable graph neural network (GNN) based framework to analyze functional magnetic resonance
images (fMRI) and identify neurological biomarkers, was proposed by Xiaoxiao et al. [232]. Motivated by the requirements for transparency and explainability in medical image analysis, the proposed BrainGNN framework included ROI-selection pooling layers (R-pool) that highlight prominent ROIs (nodes in the graph) so that the ROIs crucial for prediction can be determined. By doing so, the advantage of the BrainGNN framework is that it allows users to interpret significant brain regions in multiple ways.

The chain of reasoning behind Computer Aided Diagnostics (CAD) is attracting attention as a way to build trust in CAD decisions drawn from complicated data sources such as electronic health records, magnetic resonance imaging scans, cardiotocography, etc. To address this issue, Julian et al. [233] presented a new algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), to explain AdaBoost classification with logical and simple rules in the context of CAD-related data sets. Ada-WHIPS redistributes the weights in the individual decision nodes of the internal decision trees of the AdaBoost model. A single rule that dominated the model's choice is then discovered using a straightforward heuristic search of the weighted nodes. Moreover, according to experiments on nine CAD-related data sets, Ada-WHIPS explanations typically generalize better (mean coverage 15 percent to 68 percent) than the state of the art while being competitive in specificity.

A novel human-in-the-loop XAI system, XAI-Content based Image Retrieval (XAI-CBIR), was introduced by Deepak et al. in [234] to retrieve video frames from minimally invasive surgery (MIS) videos that are comparable to a query image based on content. MIS video frames were processed using a self-supervised DL algorithm to extract semantic features. The search results were then refined using an iterative query refinement technique, which utilized a binary classifier trained online from user feedback on relevance. A saliency map, which provides a visual description of why the system deems a retrieved image to be similar to the query image, was produced using an XAI technique. The proposed XAI-CBIR system was tested using the publicly available Cholec80 dataset, which contains 80 videos of minimally invasive cholecystectomy procedures.

2) XAI FOR CYBER SECURITY OF SMART CITIES

As increasingly data-driven artificial intelligence services such as IoT, blockchain, and DL are incorporated into contemporary smart cities, smart cities are able to offer intelligent services for energy, transportation, healthcare, and entertainment to both city locals and visitors through real-time environmental monitoring [235]. However, because smart city applications not only gather a variety of privacy-sensitive information from people and their social circles, but also control municipal services and have an impact on people's lives, cyber security, cyber crime, and privacy problems about smart cities arise. To address this issue, XAI integration into IoT and AI-enabled smart city applications can help to address black-box model difficulties and offer transparency and explainability components for making useful data-driven decisions for smart city applications. Smart city applications are usually utilized in high-risk and privacy-sensitive scenarios. Therefore, it is crucial to establish an effective XAI approach to give authorities additional information about the justification, implications, potential throughput, and an in-depth explanation of background procedures to aid in final decision-making [236].

Roland et al. [237] introduced a tree-based Gradient Boosted Regression Trees (GBRT) model in conjunction with the SHAP-value framework to identify and analyze major patterns of meteorological determinants of PM1 species and overall PM1 concentrations. SIRTA [238], a ground-based atmospheric observatory dataset for cloud and aerosol established in the city of Paris, was utilized for the experiments. The findings of this study show that shallow MLHs, cold temperatures, and low wind speeds play distinct roles during peak PM1 events in winter. Under high-pressure synoptic circulation, northeastern wind input frequently intensifies these conditions.

One of the most in-demand bus lines of Madrid was analyzed by Leticia et al. in [239] to make the smart city transport network more efficient by predicting bus passenger demand. The proposed method created an interpretable model from a Long Short Term Memory (LSTM) neural network that enhances the generated XAI model's linguistic interpretability without sacrificing precision, using a surrogate model and the 2-tuple fuzzy linguistic model. The public transportation business can save money and energy by using passenger demand forecasting to plan its resources most effectively. This methodology can also be used in the future to forecast passenger demand for other forms of transportation (air, railway, marine).

Georgios et al. [240] proposed explainable models for early prediction of certification in Massive Open Online Courses (MOOCs) for smart city professionals. MOOCs have grown significantly over the past few years due to Covid-19 and tend to become the most common type of online and remote higher education. Several ML classification techniques such as Adaptive Boosting, Gradient Boosting, Extremely Randomized Trees, Random Forest, and Logistic Regression were utilized to build corresponding predictive models using PyCaret, and the XAI method of SHAP summary plots was employed for the classifiers including LightGBM, GB, and RF. Furthermore, new classification models were built based only on the two most important features at each step, obtained from the SHAP summary plot. The experimental results showed that the effectiveness of all methods was slightly improved for all metrics.

3) XAI FOR CYBER SECURITY OF SMART FARMING
Smart farming refers to the use of cutting-edge technology in agriculture, including IoT, robots, drones, sensors, and geolocation systems. Big data, cloud computing, AI, and augmented reality are the engines of smart farming as well. However, the addition of several communication modules and AI models leaves the system open to cyber-security risks and threats to the infrastructure for smart farming [241], and cyber attacks can harm the economies of nations that heavily rely on agriculture. Moreover, due to the black-box nature of most AI models, users cannot understand the connections between features. This is crucial when the system is designed to simulate physical farming events with socioeconomic effects like evaporation [242]. Therefore, many researchers are working on the implementation potential of XAI applied in smart farming cyber security.

Nidhi et al. [242] presented an IoT and XAI-based framework to detect plant diseases such as rust and blast in pearl millet. Parametric data from the pearl millet farmland at ICAR, Mysore, India was utilized to train the proposed Custom-Net DL model, reaching a classification accuracy of 98.78%, which is similar to state-of-the-art models including Inception ResNet-V2, Inception-V3, ResNet-50, VGG-16, and VGG-19, while superior to them in reducing the training time by 86.67%. Additionally, Grad-CAM is used to display the features that the Custom-Net extracted, making the framework more transparent and explainable.

To thoroughly assess the variables that can potentially explain why agricultural land is used for plantations of wheat, maize, and olive trees, Viana et al. [243] implemented an ML and model-agnostic approach to show global and local explanations of the most important variables. The ML model Random Forest and the XAI approach LIME were deployed for the analysis, and approximately 140 variables related to agricultural socioeconomic, biophysical, and bioclimatic factors were gathered. By applying the proposed framework, it was found that the use of agricultural land for the three crop plantations in the research area was explained by five major factors: drainage density, slope, soil type, and the ombrothermic index anomaly (for humid and dry years).

4) XAI FOR CYBER SECURITY OF SMART FINANCIAL SYSTEM

The financial system has been rapidly altered by AI models, which offer cost savings and improved operational efficiency in fields like asset management, investment advice, risk forecasting, lending, and customer service [244]. On one hand, the ease of using AI in these smart financial systems provides efficiency for all parties involved; on the other hand, the risk of cyberattacks on them is growing exponentially. Attackers have traditionally been motivated primarily by money, making smart financial systems their top choice of target. To combat the financial crime targeting smart financial systems, one of the primary priorities in the smart financial domain should be the implementation of XAI [245]. The reason is that, in extremely sensitive areas such as Money Laundering detection and Corporate Mergers and Acquisitions, it is essential not only to have a highly accurate and robust model but also to be able to produce helpful justifications to win a user's faith in the automated system.

Swati et al. [246] proposed a belief-rule-based (BRB) automated AI decision-support system for loan underwriting. This system can take into account human knowledge and can employ supervised learning to gain knowledge from prior data. Factual and heuristic rules can both be accommodated by BRB's hierarchical structure. The significance of the rules triggered by a data point representing a loan application, and the contribution of attributes in the activated rules, can both be used to illustrate the decision-making process in this system. The textual justification supplied to rejected applicants for declining their loan applications can be derived from the progression of events from the factual rule base to the heuristic rule base.

A novel methodology for producing plausible counterfactual explanations for the Corporate Mergers and Acquisitions (M&A) Deep Transformers system was presented by Linyi et al. [247]. The proposed transformer-based classifier made use of the regularization advantages of adversarial training to increase model resilience. More significantly, a masked language model for financial text categorization was developed that improved upon prior methods to measure the significance of words and guarantee the creation of credible counterfactual explanations. When compared to state-of-the-art methods including SVM, CNN, BiGRU, and HAN, the results show greater accuracy and explanatory performance.

An interactive, evidence-based method to help customers understand and believe the output produced by AI-enabled algorithms was developed for analyzing customer transactions in the smart banking area by Ambreen [248]. A digital dashboard was created to make it easier to engage with algorithm results and to discuss how the suggested XAI method can greatly boost data scientists' confidence in their ability to comprehend the output of AI-enabled algorithms. In the proposed model, a Probabilistic Neural Network (PNN) was utilized to classify the multi-class scenario of bank transaction classification.

5) XAI FOR CYBER SECURITY OF HUMAN-COMPUTER INTERACTION (HCI)

HCI enables people to comprehend and engage with technology by establishing an effective channel of communication, and HCI's primary goal is to create interactions that take users' wants and abilities into account [249]. In the field of HCI, security and privacy have long been significant research concerns, where Usable Security has arisen as an interdisciplinary research area. On the other hand, HCI and AI emerge together in such a way that AI imitates human behavior to create intelligent systems, whereas HCI tries to comprehend human behavior to modify the machine to increase user experience, safety, and
efficiency. However, from an HCI standpoint, there is no assurance that an AI system's intended users will be able to comprehend it. According to user-centered design (UCD), a design must offer an understandable AI that matches the requirements and skills of the intended users (e.g., their knowledge level). Therefore, the final objective of XAI in HCI should be to guarantee that target users can comprehend the outcomes, assisting them in becoming more efficient decision-makers [250].

Gaur et al. [251] utilized XAI methods including LIME and SHAP in conjunction with ML algorithms including Logistic Regression (80.87%), Support Vector Machine (85.8%), K-nearest Neighbour (87.24%), Multilayer Perceptron (91.94%), and Decision Tree (100%) to build a robust explainable HCI model for examining the mini-mental state for Alzheimer's disease. It is worth mentioning that the most significant features contributing to the Alzheimer's disease examination were different for the LIME-based framework and the SHAP-based framework: in contrast to nWBV's dominance of the LIME features, MMSE makes a significant contribution to the Shapley values.

To fill the gap that few publications on artistic image recommendation systems give an understanding of how users perceive various features of the system, including domain expertise, relevance, explainability, and trust, Vicente et al. [252] examined several aspects of the user experience with a recommender system of artistic photos from algorithmic and HCI perspectives. Three different recommender interfaces and two different Visual Content-based Recommender (VCBR) algorithms were employed in this research.

Q. Vera et al. [253] presented a high-level introduction to the technical landscape of XAI algorithms, followed by a selective examination of current HCI works that use human-centered design and evaluation and provide conceptual and methodological tools for XAI. Human-centered XAI was highlighted in this research, and the emerging research communities of human-centered XAI were introduced in the context of HCI.

6) XAI FOR CYBER SECURITY OF SMART TRANSPORTATION

The emergence of cutting-edge technologies including software-defined networks (SDNs), IIoT, Blockchain, AI, and vehicular ad hoc networks (VANETs) has increased operational complexity while smoothly integrating smart transportation systems [254]. However, these systems can experience security problems that leave them open to intrusion. In addition, security concerns in transportation technology affect the AI model [255]. Major transportation infrastructures such as Wireless Sensor Networks (WSN), Vehicle-to-everything communication (V2X), VMS, and Traffic Signal Controllers (TSC) have either already been targeted or are still susceptible to hacking. To defend against these cyber attacks and prevent potential cyber threats to the smart transportation system, AI-enabled intrusion detection systems have been introduced recently. Although AI has made significant progress in the past few years in providing effective performance in smart transportation systems, XAI methods are still required, as XAI could make it possible for the smart transportation system to monitor transportation details such as drivers' behaviour, accident causes, and vehicles' conditions.

An ML approach to detect misbehaving vehicles in Vehicular Adhoc Networks (VANETs) was proposed by Harsh et al. [256]. In the smart VANET, the performance of each vehicle depends upon the information from other autonomous vehicles (AVs). Therefore, misinformation from misbehaving vehicles would damage the entire VANET as a whole, and detecting misbehavior is significant for building a stable and safe VANET system. The Vehicular Reference Misbehavior (VeReMi) dataset [257] was utilized in an ensemble learning approach using the Random Forest algorithm and a decision-tree-based algorithm, and an accuracy and F1 score of 98.43% and 98.5% were achieved respectively.

Shideh et al. [258] described a transportation energy model (TEM) that forecasts home transportation energy use using the XAI technique LIME. Data from the Household Travel Survey (HTS), which is utilized to accurately train the artificial neural network, has been deployed in TEM, and a high validation accuracy (83.4%) was achieved. For certain traffic analysis zones (TAZs), the significance and impact (local explanation) of HTS inputs (such as household travel, demographics, and neighborhood data) on transportation energy consumption are studied. The explainability of the proposed TEM framework can help home transportation energy distribution in two ways: describing the local inference mechanisms behind individual (household) predictions, and assessing the model's level of confidence through a broad grasp of the model.

C. Bustos et al. [259] provided an automated scheme for reducing traffic-related fatalities by utilizing a variety of Computer Vision techniques (classification, segmentation, and interpretability techniques). An explainability analysis based on image segmentation and class activation mapping on the same images, as well as an adaptation and training of a Residual Convolutional Neural Network to establish a danger index for each specific urban scene, are all steps in this process. This computational approach results in a fine-grained map of risk levels across a city as well as a heuristic for identifying potential measures to increase both pedestrian and automobile safety.

C. CYBER THREATS TARGETING XAI AND DEFENSIVE APPROACHES

In the above sections, the applications of XAI in different areas to defend against different cyber threats have been discussed. Nevertheless, although XAI could be effective in protecting other areas and models by providing transparency and explainability, XAI models themselves face cyber threats as well. Both the AI models deployed and the explainability part could be vulnerable to cyber attacks.
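The fragility that the studies below exploit can be illustrated with a toy example of our own construction (not taken from any of the cited attacks): a gradient-times-input attribution for a fixed logistic model can be reordered by a small perturbation that leaves the model's prediction unchanged. All weights and inputs here are invented for the illustration.

```python
import math

# Toy logistic model with fixed, hand-picked weights.
W = [1.0, 1.0, 0.1]

def predict(x):
    logit = sum(w * xi for w, xi in zip(W, x))
    return 1 / (1 + math.exp(-logit))

def saliency(x):
    # gradient of the logit w.r.t. each input, times the input value
    return [abs(w * xi) for w, xi in zip(W, x)]

x = [0.60, 0.50, 1.0]
# Perturbation chosen so the weighted sum (and thus the prediction) is
# identical, yet the attribution ranking flips from feature 0 to feature 1.
x_adv = [0.45, 0.65, 1.0]

print(predict(x), saliency(x))        # feature 0 ranked most important
print(predict(x_adv), saliency(x_adv))  # feature 1 ranked most important
```

The same prediction thus admits two conflicting "explanations", which is the basic lever behind the explainer-targeting attacks surveyed next.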
Some cyber attackers even utilize the explainability characteristics to attack the XAI model. Therefore, we deem it necessary to review the cyber threats targeting XAI and the corresponding defensive approaches against them in this review.

Apart from the different parts that conventional AI models need to protect, including samples, learning models, and the interoperation processes, the explainable part of XAI-based models should be paid attention to as well. The following studies describe some cyber attacks targeting XAI models using different approaches from different perspectives.

A novel black-box attack was developed by Aditya et al. [260] to examine the consistency, accuracy, and confidence security characteristics of gradient-based XAI algorithms. The proposed black-box attack focused on two categories of attack: the CI and I attacks. While the I attack attempts to attack the single explainer without affecting the classifier's prediction given a natural sample, the CI attack attempts to simultaneously compromise the integrity of the underlying classifier and explainer. The effectiveness of the attack is demonstrated on various gradient-based explainers as well as three security-relevant data sets and models through empirical and qualitative evaluation.

Thi-Thu-Huong et al. [261] proposed a robust adversarial image patch (AIP) that alters the evidence reported by interpretation methods such as gradient-weighted class activation mapping and leads to incorrect deep neural network (DNN) model predictions. Four tests pertaining to the suggested methodology were carried out on the ILSVRC image dataset, with two different kinds of pre-trained models (i.e., with and without a feature layer). The Visual Geometry Group 19-Batch Normalization (VGG19-BN) and Wide Residual Networks (Wide ResNet 101) models, in particular, were used to test the suggested strategy. Two more pre-trained models, Visual Geometry Group 19 (VGG19) and Residual Network (ResNeXt 101 32x8d), were also deployed, whereas masks and heatmaps from Grad-CAM results were utilized to evaluate the results.

Tamp-X, a unique approach that manipulates the activations of powerful NLP classifiers, was suggested by Hassan et al. [262], causing cutting-edge white-box and black-box XAI techniques to produce distorted explanations. Two steps were carried out to evaluate state-of-the-art XAI methods, including the white-box InteGrad and SmoothGrad, and the black-box LIME and SHAP. The first step was to randomly mask keywords and observe their impact on NLP classifiers, whereas the second step was to tamper with the activation functions of the classifiers and evaluate the outputs. Additionally, three cutting-edge adversarial attacks were utilized to test the tampered NLP classifiers, and it was found that adversarial attackers have a much tougher time fooling the tampered classifiers.

Slack et al. [263] provided a unique scaffolding method that, by letting an antagonistic party create any explanation they want, effectively masks the biases of any given classifier. Extensive experimental testing using real data from the criminal justice and credit scoring fields showed that the proposed fooling method was successful in producing adversarial classifiers that can trick post-hoc explanation procedures, including LIME and SHAP, with LIME being found to be more susceptible than SHAP. In detail, it was demonstrated, through extensive evaluation with numerous real-world datasets (including COMPAS [264]), how highly biased (racist) classifiers created by the proposed fooling framework can easily deceive well-liked explanation techniques like LIME and SHAP into producing innocent explanations which do not reflect the underlying biases.

Simple, model-agnostic, and intrinsic gradient-based NLP explainability approaches are considered faithful compared with other state-of-the-art XAI approaches including SHAP and LIME. However, Junlin et al. [265] show how gradient-based explanation methods can be fooled by creating a FACADE classifier with deceptive gradients that can be combined with any particular model. Although the gradients in the final model are dominated by the customized FACADE model, the predictions are comparable to those of the original model. They also demonstrated that the proposed method can manipulate a variety of gradient-based analysis methods: saliency maps, input reduction, and adversarial perturbations can all be led to mis-rank tokens, marking truly significant tokens as being of low importance and vice versa.

On the other hand, to defend against these cyber threats targeting XAI models, researchers have also developed several defensive approaches, divided into three main categories: modifying the training process and input data, modifying the model network, and using auxiliary tools.

Gintare et al. [266] assessed how JPG compression affects the categorization of adversarial images. Experimental tests demonstrated that JPG compression could undo minor adversarial perturbations brought forth by the Fast-Gradient-Sign technique. JPG compression could not undo the adversarial perturbation, nevertheless, if the perturbations are more significant. In this situation, the neural network classifiers' strong inductive bias causes inaccurate yet confident misclassifications.

Ji et al. [267] present DeepCloak, a defense technique that reduces the capacity an attacker may use to generate adversarial samples by finding and eliminating pointless characteristics from a DNN model, increasing the robustness against such adversarial attacks. In this work, the mask layer, inserted before processing the DNN model, encoded the discrepancies between the original images and related adversarial samples, as well as between these images and the output features of the preceding network model layer.

Pouya et al. [268] proposed Defense-GAN, a novel defense technique leveraging GANs to strengthen the resilience of classification models against adversarial black-box and white-box attacks. The proposed approach was demonstrated to be successful against the majority of commonly considered attack tactics without assuming a specific attack model. On
two benchmark computer vision datasets, we empirically cyber security domains, measurements to evaluate the
demonstrate that Defense-GAN consistently offers accuracy and completeness of explanations from the XAI
acceptable defense while other approaches consistently systems are required. In general, the evaluation
struggled against at least one sort of assault. measurements of XAI systems should be able to assess the
VI. ANALYSIS AND DISCUSSION
A. CHALLENGES OF USING XAI FOR CYBER SECURITY
We have reviewed the state-of-the-art XAI techniques utilized in the defense against different cyber attacks and in the protection of distinct industrial cyber security domains. Although XAI can be a powerful tool across these domains, it faces certain challenges in its application to cyber security, which we discuss in this section.
1) DATASETS
An overview of the well-known and commonly used datasets for different cyber attacks and distinct industries was provided in Table 4 and Table 5, respectively. However, there is a severe issue with the most frequently used cyber security datasets: many of them are no longer updated, a phenomenon that may be caused by privacy and ethical issues. Consequently, the most recent categories of cyber attacks are not included in the public cyber attack datasets, which leads to inefficiency when training XAI applications for cyber attack defense mechanisms. Although the industrial datasets in areas such as healthcare, smart agriculture, and smart transportation include more recent samples than the cyber attack datasets, they should be updated as well, because cyber attacks are becoming more sophisticated and diverse. Another issue is that the currently available datasets usually lack the large volume of data required for training XAI methods, which decreases both the performance and the explainability of XAI approaches; part of the reason is that some of the information related to cyber attacks and cyber industries is redundant and unbalanced. Beyond that, the heterogeneity of the samples collected in these datasets challenges XAI models: the number of features and categories varies across datasets, and some datasets consist of human-generated cyber attacks rather than exhibiting real-world, up-to-date attacks. These problems highlight the challenge that recent benchmark datasets with a massive amount of data for training and testing and a balanced number of attack categories have yet to be established.
2) EVALUATION
Evaluation measures for XAI systems are another important factor in applying XAI approaches to cyber security. When evaluating the performance of an established XAI-based cyber security system, conventional evaluation metrics such as F1-score, precision, and ROC can be used to measure the performance of the proposed mechanisms. However, when applying XAI methods in cyber security domains, measurements that evaluate the accuracy and completeness of the explanations produced by the XAI systems are also required. In general, evaluation measurements for XAI systems should be able to assess the quality, value, and satisfaction of explanations, the enhancement of the users' mental model brought about by model explanations, and the impact of explanations on the effectiveness of the model as well as on the users' confidence and reliance. Unfortunately, the findings derived from the reviews above demonstrate a remaining challenge: more generic, quantifiable XAI evaluation measurements are required to support the explainability measuring techniques and tools suggested by the community. Popular XAI explanation evaluation measurements can be divided into two main categories: user satisfaction and computational measurements. However, user-satisfaction-based evaluation approaches depend on user feedback or interviews, which may raise privacy issues for many cyber security problems. For computational measurements, on the other hand, many researchers use inherently interpretable models [56] (e.g., linear regression and decision trees) for comparison with the generated explanations. Nevertheless, there are no benchmark comparison models for this evaluation approach, and it cannot reflect the users' understanding of the explanations. Besides, current XAI evaluation systems lack measurements addressing other significant factors of the cyber security domain, including computational resources and computational power. In conclusion, a set of agreed-upon standard explainability evaluation metrics is needed for comparison in order to drive future improvements of XAI applications in cyber security.
3) CYBER THREATS FACED BY XAI MODELS
As we discussed in Section V, although XAI methods can provide transparency and explainability to AI-enabled systems to prevent cyber threats, current XAI models face many cyber attacks targeting the vulnerabilities of the explanation approaches, which is extremely dangerous for cyber security systems, as these always require a high level of safety. For instance, many researchers [263], [264] have proved that it is possible to fool some of the most popular XAI explanation methods, such as LIME and SHAP, which are also frequently deployed in XAI applications for cyber security; it has been demonstrated that the explanation-generating processes of these state-of-the-art XAI methods can be counter-intuitive. Such threats also extend to practical industrial cyber security applications, such as XAI-enabled face authentication systems. Although in Section V we discussed several defensive methods against cyber threats targeting XAI systems, most defensive approaches focus on protecting the prediction performance of XAI models rather than the explanation results. However, for XAI-based cyber security systems, the explainability of the models is significant to maintain the transparency and efficiency of the entire system and to prevent cyber attacks as well.
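The scaffolding attack of Slack et al. [263] referenced above exploits the fact that LIME and SHAP query the model on synthetic perturbations that are often out-of-distribution. A minimal sketch under simplifying assumptions; the nearest-neighbor OOD test and the toy models below are illustrative stand-ins for the paper's trained detector and real classifiers:

```python
import numpy as np

rng = np.random.default_rng(42)
training_data = rng.normal(0.0, 1.0, size=(200, 2))  # the real input distribution

def looks_synthetic(x, data, k=3, threshold=2.0):
    """Crude OOD heuristic (illustrative stand-in for the attack's trained
    detector): flag points far from their k nearest real samples."""
    dists = np.sort(np.linalg.norm(data - x, axis=1))[:k]
    return float(dists.mean()) > threshold

def biased_model(x):      # the model the attacker actually deploys (uses "sensitive" feature 0)
    return int(x[0] > 0)

def innocuous_model(x):   # what the attacker shows to the explainer (uses feature 1 only)
    return int(x[1] > 0)

def scaffolded_model(x):
    """Route real inputs to the biased model, but answer the explainer's
    off-distribution perturbations with the innocuous model, so the
    resulting explanation never exposes the bias."""
    if looks_synthetic(x, training_data):
        return innocuous_model(x)
    return biased_model(x)

real_pred = scaffolded_model(np.array([0.5, -0.5]))       # in-distribution -> biased answer
perturbed_pred = scaffolded_model(np.array([8.0, -8.0]))  # explainer-style perturbation -> innocuous answer
```

Because LIME's and SHAP's surrogate models are fit almost entirely on the perturbed queries, the explanation they produce describes `innocuous_model`, not the model that actually serves real users.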
4) PRIVACY AND ETHICAL ISSUES
In addition to the aforementioned technical challenges, privacy and ethical issues are also crucial when implementing XAI in cyber security. Throughout the system life cycle, XAI models must explicitly take privacy concerns into account. It is commonly agreed that respecting every person's right to privacy is essential, especially in very sensitive areas of cyber security such as authentication, e-mail, and passwords. Moreover, XAI systems naturally fall within the general ethical concern of potential discrimination (such as racism, sexism, and ageism) by AI systems. In theory, identical biases may be produced by any AI model built using previously collected data from humans. It is important to take precautions to ensure that there is no discrimination, bias, or unfairness in the judgments made by the XAI system or in the explanations that accompany them. The ethical bias of XAI systems should be eliminated in terms of justification as well as explainability, particularly in specific domains of cyber security applications. As for privacy, because the data are gathered from security-related sources, privacy and security concerns increase. It is therefore essential to guarantee that data and models are protected from adversarial attacks and from tampering by unauthorized individuals, which means that only authorized individuals should be permitted access to XAI models.

B. KEY INSIGHTS LEARNED FROM USING XAI FOR CYBER SECURITY
In this section, some key insights learned from using XAI for cyber security are discussed based on the review in the above sections. The main insights for XAI implementation in cyber security systems can be itemized as follows:
1) User trust and reliance should be satisfied. By offering explanations, an XAI system can increase end users' trust in the XAI-based cyber security system. Users of an XAI system can test their perception of the system's correctness and reliability, and they come to rely on the system as a result of that trust.
2) Model visualization and inspection should be considered. Cyber security experts can benefit from XAI visualization and explainability when inspecting model uncertainty and trustworthiness. Additionally, identifying and analyzing XAI model and system failure cases is another crucial component of model visualization and inspection.
3) Model tuning and selection are crucial factors in ensuring the efficiency of the XAI model implemented in cyber security. Selecting different explanation approaches for distinct ML or DL algorithms in different cyber security tasks significantly influences the performance and explainability of XAI models. In addition, tuning the parameters and model structures of the established XAI model is another crucial consideration.
4) Model defense should be highlighted, in particular for cyber security tasks, as they are the main targets of cyber attackers. Especially for XAI-based cyber security mechanisms, the decision model, the security data, and the explanation process should all be protected against cyber threats.
5) Privacy awareness is another insight that XAI methods can provide for cyber security systems. Giving end users of cyber security systems a way to evaluate their data privacy is a significant objective in the application of XAI. End users can learn through XAI explanations what user data is used in algorithmic decision-making.

C. FUTURE RESEARCH DIRECTIONS
1) HIGH-QUALITY DATASETS
The quantity and quality of the available datasets have a significant impact on how well XAI methods work for cyber security systems, and the biases and constraints of the datasets used to train the models affect how accurate the decisions and explanations are. On the other hand, as discussed in the above sections, the existing cyber security datasets do not reflect the most recent cyber attacks, owing to privacy and ethical issues. Data from real networks or the Internet typically contain sensitive information, such as personal or business details, and if made publicly available, they may disclose security flaws in the networks from which they originated. Additionally, imbalance in both the volume and the features of the datasets negatively influences the establishment of XAI-based cyber security systems. Therefore, the construction of high-quality, up-to-date datasets for XAI applications in cyber security is a promising future research direction.
2) TRADE-OFF BETWEEN PERFORMANCE AND EXPLAINABILITY
It is essential for cyber security experts to balance the performance and explainability of newly introduced XAI-enabled cyber security systems. Although some self-explainable XAI approaches, such as decision trees, are quite transparent and allow users to understand the decision-making process more easily, their performance is not always satisfying. On the other hand, the AI algorithms that currently perform best (for example, DL) are the least explainable, creating a demand for explainable models that can also achieve high performance. Some researchers have exploited this area.
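The trade-off can be made concrete on a toy task: a one-rule decision stump ("if feature f exceeds threshold t, predict p") is fully interpretable but cannot represent a nonlinear XOR-style boundary, while an opaque nearest-neighbor model fits it easily. The data and models below are purely illustrative, not drawn from the surveyed works:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy task with a nonlinear (XOR-style) decision boundary.
X = rng.uniform(-1.0, 1.0, size=(400, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)
Xtr, ytr, Xte, yte = X[:300], y[:300], X[300:], y[300:]

def fit_stump(Xtr, ytr):
    """Interpretable model: one axis-aligned rule 'if x[f] > t predict p,
    else 1 - p', chosen by exhaustive search over training thresholds."""
    best = (-1.0, 0, 0.0, 1)
    for f in range(Xtr.shape[1]):
        for t in Xtr[:, f]:
            for p in (0, 1):
                acc = float((np.where(Xtr[:, f] > t, p, 1 - p) == ytr).mean())
                if acc > best[0]:
                    best = (acc, f, t, p)
    return best[1:]

def stump_predict(X, rule):
    f, t, p = rule
    return np.where(X[:, f] > t, p, 1 - p)

def one_nn_predict(Xtr, ytr, X):
    """Flexible but opaque model: copy the label of the nearest training point."""
    d = np.linalg.norm(X[:, None, :] - Xtr[None, :, :], axis=2)
    return ytr[np.argmin(d, axis=1)]

stump_acc = float((stump_predict(Xte, fit_stump(Xtr, ytr)) == yte).mean())
knn_acc = float((one_nn_predict(Xtr, ytr, Xte) == yte).mean())
# The transparent one-rule model stays near chance on XOR; the opaque model does not.
```

The stump's single rule can be read off directly, which is exactly the property the best-performing model gives up.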
The authors of [269] significantly reduce the trade-off between efficiency and performance by introducing XAI for DNNs into existing quantization techniques, and the authors of [270] demonstrated that the proposed wavelet modifications can lead to significantly smaller, simpler, more computationally efficient, and more naturally interpretable models while maintaining performance. However, there is still a lack of research focusing on the trade-off between the performance and explainability of XAI approaches applied in cyber security.
3) USER-CENTERED XAI
The human understandability of XAI approaches has become the focus of some recent studies seeking new potential applications in cyber security. As mentioned in the above sections, user satisfaction with the generated explanations is a significant component of explainability evaluation for XAI approaches. However, in cyber security, user questionnaires and feedback are limited to some degree due to security concerns. Therefore, how to build user-centered XAI systems for cyber security end users, in terms of user understanding, user satisfaction, and user performance, without violating security constraints could be a future research direction.
4) MULTIMODAL XAI
People can readily understand multimodal information of text, video, audio, and images presented in the same context. The benefit of multimodality is its capacity to gather and combine important and comprehensive data from a range of sources, enabling a far richer depiction of the issue at hand. In some industrial cyber security areas, such as healthcare, medical decisions are primarily driven by a variety of influencing variables originating from a plurality of underlying signals and information bases, which highlights the need for multimodality at every stage. Given the growing application of XAI in these areas, multimodal XAI could be developed in the near future.
5) ADVERSARIAL ATTACKS AND DEFENSES
As discussed in this review, although XAI can be applied in cyber security to prevent cyber attacks, the performance and explainability of XAI models can themselves be attacked, and adversarial inputs to the sample data also deserve attention. Some researchers [263] have already developed powerful tools to fool state-of-the-art XAI methods, including LIME and SHAP. Although cyber threats and the corresponding defensive mechanisms focusing on the performance of AI models have been studied recently, adversarial attacks and defenses targeting the explainability of XAI models still require further research.
6) PROTECTION OF DATA
In cyber security, the confidentiality and protection of data are significant issues, as privacy and ethical concerns have been increasingly highlighted. For XAI-based systems, the situation is even more severe, since both the decisions and the explanations related to users must be preserved. As a result, there is a conflict between using big data for security and safeguarding those data. Data must be guaranteed to be safe from adversarial attacks and from manipulation by unauthorized users, while legitimate users should still be able to access them. Therefore, the protection of the data and the generated explanations of XAI systems could be a future research direction as well.

VII. CONCLUSION
XAI is a powerful framework for introducing explainability and transparency into the decisions of conventional AI models, including DL and ML models. Cyber security, in turn, is an area where transparency and explainability are required to defend against cyber security threats and to analyze the generated security decisions. Therefore, in this paper, we presented a comprehensive survey of state-of-the-art research on XAI for cyber security applications. We summarized the basic principles and taxonomies of state-of-the-art XAI models together with essential tools, such as a general framework and the available datasets. We also investigated the most advanced XAI-based cyber security systems from different application perspectives, including XAI applications in defending against different categories of cyber attacks, XAI for cyber security in distinct industrial applications, and cyber threats targeting XAI models together with the corresponding defensive approaches. Common cyber attacks, including malware, spam, fraud, DoS, DGAs, phishing, network intrusion, and botnets, were introduced, and the corresponding XAI-based defensive mechanisms were presented. The implementation of XAI in various industrial areas, namely smart healthcare, smart financial systems, smart agriculture, smart cities, smart transportation, and Human-Computer Interaction, was described exhaustively. Distinct approaches of cyber attacks targeting XAI models and the related defensive methods were introduced as well. Following these, we pointed out and discussed challenges, key insights, and research directions for XAI applications in cyber security. We hope that this paper can serve as a reference for researchers, developers, and security professionals interested in using XAI models to solve challenging issues in cyber security domains.

REFERENCES
[1] CISA, “What is Cybersecurity? | CISA,” What is Cybersecurity? https://www.cisa.gov/uscert/ncas/tips/ST04-001 (accessed Jul. 01, 2022).
[2] D. S. Berman, A. L. Buczak, J. S. Chavis, and C. L. Corbett, “A Survey of Deep Learning Methods for Cyber Security,” Information, vol. 10, no. 4, Art. no. 4, Apr. 2019, doi: 10.3390/info10040122.
[3] “Number of internet users worldwide 2021,” Statista. https://www.statista.com/statistics/273018/number-of-internet-users-worldwide/ (accessed Jul. 01, 2022).
[4] “2021 Cyber Attack Trends Mid-Year Report | Check Point Software.” https://pages.checkpoint.com/cyber-attack-2021-trends.html (accessed Jul. 01, 2022).
[5] “Cyberattack disrupts unemployment benefits in some states,” Washington Post. Accessed: Jul. 02, 2022. [Online]. Available: https://www.washingtonpost.com/politics/cyberattack-disrupts-
related to users should be preserved. As a result, there is a
unemployment-benefits-in-some-states/2022/06/30/8f8fe138-f88a- [24] J. Li, “Cyber security meets artificial intelligence: a survey,”
11ec-81db-ac07a394a86b_story.html Frontiers Inf Technol Electronic Eng, vol. 19, no. 12, pp. 1462–1474,
[6] “Threat Landscape,” ENISA. Dec. 2018, doi: 10.1631/FITEE.1800573.
https://www.enisa.europa.eu/topics/threat-risk-management/threats- [25] I. Ahmed, G. Jeon, and F. Piccialli, “From Artificial Intelligence to
and-trends (accessed Jul. 02, 2022). Explainable Artificial Intelligence in Industry 4.0: A Survey on
[7] D. Gümüşbaş, T. Yıldırım, A. Genovese, and F. Scotti, “A What, How, and Where,” IEEE Transactions on Industrial
Comprehensive Survey of Databases and Deep Learning Methods Informatics, vol. 18, no. 8, pp. 5031–5042, 2022, doi:
for Cybersecurity and Intrusion Detection Systems,” IEEE Systems 10.1109/TII.2022.3146552.
Journal, vol. 15, no. 2, pp. 1717–1731, Jun. 2021, doi: [26] A. Kuppa and N.-A. Le-Khac, “Adversarial XAI Methods in
10.1109/JSYST.2020.2992966. Cybersecurity,” IEEE Transactions on Information Forensics and
[8] S. Zeadally, E. Adi, Z. Baig, and I. A. Khan, “Harnessing Artificial Security, vol. 16, pp. 4924–4938, 2021, doi:
Intelligence Capabilities to Improve Cybersecurity,” IEEE Access, 10.1109/TIFS.2021.3117075.
vol. 8, pp. 23817–23837, 2020, doi: [27] S. Mane and D. Rao, “Explaining Network Intrusion Detection
10.1109/ACCESS.2020.2968045. System Using Explainable AI Framework.” arXiv, Mar. 12, 2021.
[9] S. M. Mathews, “Explainable Artificial Intelligence Applications in doi: 10.48550/arXiv.2103.07110.
NLP, Biomedical, and Malware Classification: A Literature [28] “Survey of AI in Cybersecurity for Information Technology
Review,” in Intelligent Computing, Cham, 2019, pp. 1269–1292. doi: Management | IEEE Conference Publication | IEEE Xplore.”
10.1007/978-3-030-22868-2_90. https://ieeexplore.ieee.org/document/8813605 (accessed Jul. 05,
[10] “Explainable Artificial Intelligence for Tabular Data: A Survey | 2022).
IEEE Journals & Magazine | IEEE Xplore.” [29] S. Mahdavifar and A. A. Ghorbani, “Application of deep learning to
https://ieeexplore.ieee.org/document/9551946 (accessed Jul. 02, cybersecurity: A survey,” Neurocomputing, vol. 347, pp. 149–176,
2022). Jun. 2019, doi: 10.1016/j.neucom.2019.02.056.
[11] B. Goodman and S. Flaxman, “European Union Regulations on [30] G. Srivastava et al., “XAI for Cybersecurity: State of the Art,
Algorithmic Decision-Making and a ‘Right to Explanation,’” AI Challenges, Open Issues and Future Directions.” arXiv, Jun. 02,
Magazine, vol. 38, no. 3, Art. no. 3, Oct. 2017, doi: 2022. doi: 10.48550/arXiv.2206.03585.
10.1609/aimag.v38i3.2741. [31] “AI-Driven Cybersecurity: An Overview, Security Intelligence
[12] “A Systematic Review of Human–Computer Interaction and Modeling and Research Directions | SpringerLink.”
Explainable Artificial Intelligence in Healthcare With Artificial https://link.springer.com/article/10.1007/s42979-021-00557-0
Intelligence Techniques | IEEE Journals & Magazine | IEEE (accessed Jul. 05, 2022).
Xplore.” https://ieeexplore.ieee.org/document/9614151 (accessed Jul. [32] M. Humayun, M. Niazi, N. Jhanjhi, M. Alshayeb, and S. Mahmood,
02, 2022). “Cyber Security Threats and Vulnerabilities: A Systematic Mapping
[13] H. Jiang, J. Nagra, and P. Ahammad, “SoK: Applying Machine Study,” Arab J Sci Eng, vol. 45, no. 4, pp. 3171–3189, Apr. 2020,
Learning in Security - A Survey,” Nov. 2016. doi: 10.1007/s13369-019-04319-2.
[14] “A Survey of Data Mining and Machine Learning Methods for [33] K. Shaukat, S. Luo, V. Varadharajan, I. A. Hameed, and M. Xu, “A
Cyber Security Intrusion Detection | IEEE Journals & Magazine | Survey on Machine Learning Techniques for Cyber Security in the
IEEE Xplore.” https://ieeexplore.ieee.org/document/7307098 Last Decade,” IEEE Access, vol. 8, pp. 222310–222354, 2020, doi:
(accessed Jul. 05, 2022). 10.1109/ACCESS.2020.3041951.
[15] D. Kwon, H. Kim, J. Kim, S. C. Suh, I. Kim, and K. J. Kim, “A [34] A. Bécue, I. Praça, and J. Gama, “Artificial intelligence, cyber-
survey of deep learning-based network anomaly detection,” Cluster threats and Industry 4.0: challenges and opportunities,” Artif Intell
Comput, vol. 22, no. 1, pp. 949–961, Jan. 2019, doi: Rev, vol. 54, no. 5, pp. 3849–3886, Jun. 2021, doi: 10.1007/s10462-
10.1007/s10586-017-1117-8. 020-09942-2.
[16] A. P. Veiga, “Applications of Artificial Intelligence to Network [35] I. Kok, F. Y. Okay, O. Muyanli, and S. Ozdemir, “Explainable
Security.” arXiv, Mar. 27, 2018. doi: 10.48550/arXiv.1803.09992. Artificial Intelligence (XAI) for Internet of Things: A Survey.”
[17] D. Ucci, L. Aniello, and R. Baldoni, “Survey of machine learning arXiv, Jun. 07, 2022. doi: 10.48550/arXiv.2206.04800.
techniques for malware analysis,” Computers & Security, vol. 81, pp. [36] M. Macas, C. Wu, and W. Fuertes, “A survey on deep learning for
123–147, Mar. 2019, doi: 10.1016/j.cose.2018.11.001. cybersecurity: Progress, challenges, and opportunities,” Computer
[18] P. Mishra, V. Varadharajan, U. Tupakula, and E. S. Pilli, “A Networks, vol. 212, p. 109032, Jul. 2022, doi:
Detailed Investigation and Analysis of Using Machine Learning 10.1016/j.comnet.2022.109032.
Techniques for Intrusion Detection,” IEEE Communications Surveys [37] 14:00-17:00, “ISO/IEC 27032:2012,” ISO.
& Tutorials, vol. 21, no. 1, pp. 686–728, 2019, doi: https://www.iso.org/cms/render/live/en/sites/isoorg/contents/data/sta
10.1109/COMST.2018.2847722. ndard/04/43/44375.html (accessed Jul. 05, 2022).
[19] C. Rudin, “Stop Explaining Black Box Machine Learning Models [38] “What is a Cyber Attack?,” Check Point Software.
for High Stakes Decisions and Use Interpretable Models Instead,” https://www.checkpoint.com/cyber-hub/cyber-security/what-is-
arXiv:1811.10154 [cs, stat], Sep. 2019, Accessed: Apr. 28, 2022. cyber-attack/ (accessed Jul. 05, 2022).
[Online]. Available: http://arxiv.org/abs/1811.10154 [39] “Cybersecurity Market worth $345.4 billion by 2026.”
[20] Z. Lv, Y. Han, A. K. Singh, G. Manogaran, and H. Lv, https://www.marketsandmarkets.com/PressReleases/cyber-
“Trustworthiness in Industrial IoT Systems Based on Artificial security.asp (accessed Jul. 05, 2022).
Intelligence,” IEEE Transactions on Industrial Informatics, vol. 17, [40] M.-A. Clinciu and H. Hastie, “A Survey of Explainable AI
no. 2, pp. 1496–1504, 2021, doi: 10.1109/TII.2020.2994747. Terminology,” in Proceedings of the 1st Workshop on Interactive
[21] C. S. Wickramasinghe, D. L. Marino, K. Amarasinghe, and M. Natural Language Technology for Explainable Artificial Intelligence
Manic, “Generalization of Deep Learning for Cyber-Physical System (NL4XAI 2019), 2019, pp. 8–13. doi: 10.18653/v1/W19-8403.
Security: A Survey,” in IECON 2018 - 44th Annual Conference of [41] O. Biran and C. V. Cotton, “Explanation and Justification in
the IEEE Industrial Electronics Society, 2018, pp. 745–751. doi: Machine Learning : A Survey Or,” undefined, 2017, Accessed: Jul.
10.1109/IECON.2018.8591773. 08, 2022. [Online]. Available:
[22] P. A. A. Resende and A. C. Drummond, “A Survey of Random https://www.semanticscholar.org/paper/Explanation-and-
Forest Based Methods for Intrusion Detection Systems,” ACM Justification-in-Machine-Learning-%3A-Biran-
Comput. Surv., 2018, doi: 10.1145/3178582. Cotton/02e2e79a77d8aabc1af1900ac80ceebac20abde4
[23] R. A. Alves and D. Costa, “A Survey of Random Forest Based [42] T. Speith, “A Review of Taxonomies of Explainable Artificial
Methods for Intrusion Detection Systems,” ACM Computing Surveys Intelligence (XAI) Methods,” in 2022 ACM Conference on Fairness,
(CSUR), May 2018, doi: 10.1145/3178582. Accountability, and Transparency, New York, NY, USA, Jun. 2022,
pp. 2239–2250. doi: 10.1145/3531146.3534639.
[43] S. Han, M. Xie, H.-H. Chen, and Y. Ling, “Intrusion Detection in Cyber-Physical Systems: Techniques and Challenges,” IEEE Systems Journal, vol. 8, no. 4, pp. 1052–1062, 2014, doi: 10.1109/JSYST.2013.2257594.
[44] R. Donida Labati, A. Genovese, V. Piuri, F. Scotti, and S. Vishwakarma, “Computational Intelligence in Cloud Computing,” in Recent Advances in Intelligent Engineering: Volume Dedicated to Imre J. Rudas’ Seventieth Birthday, L. Kovács, T. Haidegger, and A. Szakál, Eds. Cham: Springer International Publishing, 2020, pp. 111–127. doi: 10.1007/978-3-030-14350-3_6.
[45] R. A. Nafea and M. Amin Almaiah, “Cyber Security Threats in Cloud: Literature Review,” in 2021 International Conference on Information Technology (ICIT), Jul. 2021, pp. 779–786. doi: 10.1109/ICIT52682.2021.9491638.
[46] “Black Box Attacks on Explainable Artificial Intelligence (XAI) methods in Cyber Security | IEEE Conference Publication | IEEE Xplore.” https://ieeexplore.ieee.org/abstract/document/9206780 (accessed Jul. 08, 2022).
[47] K. D. Ahmed and S. Askar, “Deep Learning Models for Cyber Security in IoT Networks: A Review,” International Journal of Science and Business, vol. 5, no. 3, pp. 61–70, 2021.
[48] J. Gerlings, A. Shollo, and I. Constantiou, “Reviewing the Need for Explainable Artificial Intelligence (xAI).” arXiv, Jan. 26, 2021. doi: 10.48550/arXiv.2012.01007.
[49] T. Perarasi, S. Vidhya, L. Moses M., and P. Ramya, “Malicious Vehicles Identifying and Trust Management Algorithm for Enhance the Security in 5G-VANET,” in 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Jul. 2020, pp. 269–275. doi: 10.1109/ICIRCA48905.2020.9183184.
[50] G. Jaswal, V. Kanhangad, and R. Ramachandra, AI and Deep Learning in Biometric Security: Trends, Potential, and Challenges. CRC Press, 2021.
[51] “What is GDPR, the EU’s new data protection law?,” GDPR.eu, Nov. 07, 2018. https://gdpr.eu/what-is-gdpr/ (accessed Jul. 08, 2022).
[52] C. T. Wolf, “Explainability scenarios: towards scenario-based XAI design,” in Proceedings of the 24th International Conference on Intelligent User Interfaces, New York, NY, USA, Mar. 2019, pp. 252–257. doi: 10.1145/3301275.3302317.
[53] A. Barredo Arrieta et al., “Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI,” Information Fusion, vol. 58, pp. 82–115, Jun. 2020, doi: 10.1016/j.inffus.2019.12.012.
[54] D. V. Carvalho, E. M. Pereira, and J. S. Cardoso, “Machine Learning Interpretability: A Survey on Methods and Metrics,” Electronics, vol. 8, no. 8, Art. no. 8, Aug. 2019, doi: 10.3390/electronics8080832.
[55] V. Arya et al., “One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques.” arXiv, Sep. 14, 2019. doi: 10.48550/arXiv.1909.03012.
[56] M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” arXiv, Aug. 09, 2016. doi: 10.48550/arXiv.1602.04938.
[57] A. Altmann, L. Toloşi, O. Sander, and T. Lengauer, “Permutation importance: a corrected feature importance measure,” Bioinformatics, vol. 26, no. 10, pp. 1340–1347, May 2010, doi: 10.1093/bioinformatics/btq134.
[58] R. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec, “GNNExplainer: Generating Explanations for Graph Neural Networks.” arXiv, Nov. 13, 2019. doi: 10.48550/arXiv.1903.03894.
[59] S. M. Lundberg and S.-I. Lee, “A Unified Approach to Interpreting Model Predictions,” in Advances in Neural Information Processing Systems, 2017, vol. 30. Accessed: Jul. 09, 2022. [Online]. Available: https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
[60] R. Iyer, Y. Li, H. Li, M. Lewis, R. Sundar, and K. Sycara, “Transparency and Explanation in Deep Reinforcement Learning Neural Networks.” arXiv, Sep. 17, 2018. doi: 10.48550/arXiv.1809.06061.
[61] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization,” Int J Comput Vis, vol. 128, no. 2, pp. 336–359, Feb. 2020, doi: 10.1007/s11263-019-01228-7.
[62] S. Wachter, B. Mittelstadt, and C. Russell, “Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR.” Rochester, NY, Oct. 06, 2017. doi: 10.2139/ssrn.3063289.
[63] M. Ibrahim, M. Louie, C. Modarres, and J. Paisley, “Global Explanations of Neural Networks: Mapping the Landscape of Predictions.” arXiv, Feb. 06, 2019. doi: 10.48550/arXiv.1902.02384.
[64] H. Liu, Q. Yin, and W. Y. Wang, “Towards Explainable NLP: A Generative Explanation Framework for Text Classification.” arXiv, Jun. 11, 2019. doi: 10.48550/arXiv.1811.00196.
[65] M. Danilevsky, K. Qian, R. Aharonov, Y. Katsis, B. Kawas, and P. Sen, “A Survey of the State of Explainable AI for Natural Language Processing.” arXiv, Oct. 01, 2020. doi: 10.48550/arXiv.2010.00711.
[66] J. V. Jeyakumar, J. Noor, Y.-H. Cheng, L. Garcia, and M. Srivastava, “How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods,” in Advances in Neural Information Processing Systems, 2020, vol. 33, pp. 4211–4222. Accessed: Jul. 09, 2022. [Online]. Available: https://proceedings.neurips.cc/paper/2020/hash/2c29d89cc56cdb191c60db2f0bae796b-Abstract.html
[67] W. Jin, X. Li, and G. Hamarneh, “Evaluating Explainable AI on a Multi-Modal Medical Imaging Task: Can Existing Algorithms Fulfill Clinical Requirements?” arXiv, Mar. 12, 2022. doi: 10.48550/arXiv.2203.06487.
[68] J. Lu, D. Lee, T. W. Kim, and D. Danks, “Good Explanation for Algorithmic Transparency.” Rochester, NY, Nov. 11, 2019. doi: 10.2139/ssrn.3503603.
[69] L. Amgoud and H. Prade, “Using arguments for making and explaining decisions,” Artificial Intelligence, vol. 173, no. 3, pp. 413–436, Mar. 2009, doi: 10.1016/j.artint.2008.11.006.
[70] M. Wu, M. Hughes, S. Parbhoo, M. Zazzi, V. Roth, and F. Doshi-Velez, “Beyond Sparsity: Tree Regularization of Deep Models for Interpretability,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Art. no. 1, Apr. 2018, doi: 10.1609/aaai.v32i1.11501.
[71] H. Lakkaraju, E. Kamar, R. Caruana, and J. Leskovec, “Faithful and Customizable Explanations of Black Box Models,” in Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, Jan. 2019, pp. 131–138. doi: 10.1145/3306618.3314229.
[72] G. Fidel, R. Bitton, and A. Shabtai, “When Explainability Meets Adversarial Learning: Detecting Adversarial Examples using SHAP Signatures.” arXiv, Sep. 08, 2019. doi: 10.48550/arXiv.1909.03418.
[73] W. Guo, “Explainable Artificial Intelligence for 6G: Improving Trust between Human and Machine,” IEEE Communications Magazine, vol. 58, no. 6, pp. 39–45, Jun. 2020, doi: 10.1109/MCOM.001.2000050.
[74] F. Hussain, R. Hussain, and E. Hossain, “Explainable Artificial Intelligence (XAI): An Engineering Perspective.” arXiv, Jan. 10, 2021. doi: 10.48550/arXiv.2101.03613.
[75] D. Slack, S. Hilgard, E. Jia, S. Singh, and H. Lakkaraju, “Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods.” arXiv, Feb. 03, 2020. doi: 10.48550/arXiv.1911.02508.
[76] N. Papernot et al., “Technical Report on the CleverHans v2.1.0 Adversarial Examples Library.” arXiv, Jun. 27, 2018. doi: 10.48550/arXiv.1610.00768.
[77] D. Gunning, M. Stefik, J. Choi, T. Miller, S. Stumpf, and G.-Z. Yang, “XAI—Explainable artificial intelligence,” Science Robotics, vol. 4, no. 37, p. eaay7120, Dec. 2019, doi: 10.1126/scirobotics.aay7120.
[78] C. Mars, R. Dès, and M. Boussard, “The three stages of Explainable AI: How explainability facilitates real world deployment of AI,” Jan. 2020.
[79] L. Longo, R. Goebel, F. Lecue, P. Kieseberg, and A. Holzinger, Explainable Artificial Intelligence: Concepts, Applications, Research Challenges and Visions. 2020, p. 16. doi: 10.1007/978-3-030-57321-8_1.
[80] “Evaluation of Post-hoc XAI Approaches Through Synthetic Tabular Data | Foundations of Intelligent Systems,” Guide Proceedings. https://dl.acm.org/doi/abs/10.1007/978-3-030-59491-6_40 (accessed Jul. 10, 2022).
[81] L. Arras, A. Osman, and W. Samek, “CLEVR-XAI: A benchmark dataset for the ground truth evaluation of neural network explanations,” Information Fusion, vol. 81, pp. 14–40, May 2022, doi: 10.1016/j.inffus.2021.11.008.
[82] A. Rai, “Explainable AI: from black box to glass box,” J. of the Acad. Mark. Sci., vol. 48, no. 1, pp. 137–141, Jan. 2020, doi: 10.1007/s11747-019-00710-5.
[83] E. LEMONNE, “Ethics Guidelines for Trustworthy AI,” FUTURIUM - European Commission, Dec. 17, 2018. https://ec.europa.eu/futurium/en/ai-alliance-consultation (accessed
symposium.org/ndss2014/programme/drebin-effective-and-explainable-detection-android-malware-your-pocket/ (accessed Jul. 13, 2022).
[102] T. A. Almeida, J. M. G. Hidalgo, and A. Yamakami, “Contributions to the study of SMS spam filtering: new collection and results,” in Proceedings of the 11th ACM symposium on Document engineering, New York, NY, USA, Sep. 2011, pp. 259–262. doi: 10.1145/2034691.2034742.
Jul. 10, 2022). [103] V. Metsis, I. Androutsopoulos, and G. Paliouras, “Spam Filtering
[84] European Parliament. Directorate General for Parliamentary with Naive Bayes - Which Naive Bayes?,” Jan. 2006.
Research Services., The impact of the general data protection [104] M. S. I. Mamun, M. A. Rathore, A. H. Lashkari, N. Stakhanova, and
regulation on artificial intelligence. LU: Publications Office, 2020. A. A. Ghorbani, “Detecting Malicious URLs Using Lexical
Accessed: Jul. 10, 2022. [Online]. Available: Analysis,” in Network and System Security, Cham, 2016, pp. 467–
https://data.europa.eu/doi/10.2861/293 482. doi: 10.1007/978-3-319-46298-1_30.
[85] M. Ebers, “Regulating Explainable AI in the European Union. An [105] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed
Overview of the Current Legal Framework(s).” Rochester, NY, Aug. analysis of the KDD CUP 99 data set,” in 2009 IEEE Symposium on
09, 2021. doi: 10.2139/ssrn.3901732. Computational Intelligence for Security and Defense Applications,
[86] Z. C. Lipton, “The Mythos of Model Interpretability: In machine Jul. 2009, pp. 1–6. doi: 10.1109/CISDA.2009.5356528.
learning, the concept of interpretability is both important and [106] A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, “Toward
slippery.,” Queue, vol. 16, no. 3, pp. 31–57, Jun. 2018, doi: developing a systematic approach to generate benchmark datasets for
10.1145/3236386.3241340. intrusion detection,” Computers & Security, vol. 31, no. 3, pp. 357–
[87] “Explainability for artificial intelligence in healthcare: a 374, May 2012, doi: 10.1016/j.cose.2011.12.012.
multidisciplinary perspective | BMC Medical Informatics and [107] C. Kolias, G. Kambourakis, A. Stavrou, and S. Gritzalis, “Intrusion
Decision Making | Full Text.” Detection in 802.11 Networks: Empirical Evaluation of Threats and
https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/ a Public Dataset,” IEEE Communications Surveys & Tutorials, vol.
s12911-020-01332-6 (accessed Jul. 11, 2022). 18, no. 1, pp. 184–208, 2016, doi: 10.1109/COMST.2015.2402161.
[88] A. Holzinger, “Explainable AI and Multi-Modal Causability in [108] “Toward Generating a New Intrusion Detection Dataset and
Medicine,” i-com, vol. 19, no. 3, pp. 171–179, Dec. 2020, doi: Intrusion Traffic Characterization | Request PDF.”
10.1515/icom-2020-0024. https://www.researchgate.net/publication/322870768_Toward_Gene
[89] S. Wachter, B. Mittelstadt, and C. Russell, “Why Fairness Cannot Be rating_a_New_Intrusion_Detection_Dataset_and_Intrusion_Traffic_
Automated: Bridging the Gap Between EU Non-Discrimination Law Characterization (accessed Jul. 13, 2022).
and AI,” SSRN Journal, 2020, doi: 10.2139/ssrn.3547922. [109] I. Sharafaldin, A. H. Lashkari, S. Hakak, and A. A. Ghorbani,
[90] A. Holzinger, G. Langs, H. Denk, K. Zatloukal, and H. Müller, “Developing Realistic Distributed Denial of Service (DDoS) Attack
“Causability and explainability of artificial intelligence in medicine,” Dataset and Taxonomy,” in 2019 International Carnahan
WIREs Data Mining and Knowledge Discovery, vol. 9, no. 4, p. Conference on Security Technology (ICCST), 2019, pp. 1–8. doi:
e1312, 2019, doi: 10.1002/widm.1312. 10.1109/CCST.2019.8888419.
[91] S. Tonekaboni, S. Joshi, M. D. McCradden, and A. Goldenberg, [110] A. Alsaedi, N. Moustafa, Z. Tari, A. Mahmood, and A. Anwar,
“What Clinicians Want: Contextualizing Explainable Machine “TON_IoT Telemetry Dataset: A New Generation Dataset of IoT
Learning for Clinical End Use.” arXiv, Aug. 07, 2019. doi: and IIoT for Data-Driven Intrusion Detection Systems,” IEEE
10.48550/arXiv.1905.05134. Access, vol. 8, pp. 165130–165150, 2020, doi:
[92] O. Yavanoglu and M. Aydos, “A review on cyber security datasets 10.1109/ACCESS.2020.3022862.
for machine learning algorithms,” in 2017 IEEE International [111] R. Damasevicius et al., “LITNET-2020: An Annotated Real-World
Conference on Big Data (Big Data), 2017, pp. 2186–2193. doi: Network Flow Dataset for Network Intrusion Detection,” Electronics,
10.1109/BigData.2017.8258167. vol. 9, no. 5, Art. no. 5, May 2020, doi: 10.3390/electronics9050800.
[93] Y. Xin et al., “Machine Learning and Deep Learning Methods for [112] G. Creech and J. Hu, “Generation of a new IDS test dataset: Time to
Cybersecurity,” IEEE Access, vol. 6, pp. 35365–35381, 2018, doi: retire the KDD collection,” in 2013 IEEE Wireless Communications
10.1109/ACCESS.2018.2836950. and Networking Conference (WCNC), Apr. 2013, pp. 4487–4492.
[94] Y. Meidan et al., “N-BaIoT—Network-Based Detection of IoT doi: 10.1109/WCNC.2013.6555301.
Botnet Attacks Using Deep Autoencoders,” IEEE Pervasive [113] N. Moustafa and J. Slay, “UNSW-NB15: a comprehensive data set
Computing, vol. 17, no. 3, pp. 12–22, 2018, doi: for network intrusion detection systems (UNSW-NB15 network data
10.1109/MPRV.2018.03367731. set),” in 2015 Military Communications and Information Systems
[95] Y. M. P. Pa, S. Suzuki, K. Yoshioka, T. Matsumoto, T. Kasama, and Conference (MilCIS), 2015, pp. 1–6. doi:
C. Rossow, “IoTPOT: A Novel Honeypot for Revealing Current IoT 10.1109/MilCIS.2015.7348942.
Threats,” Journal of Information Processing, vol. 24, no. 3, pp. 522– [114] S. García, M. Grill, J. Stiborek, and A. Zunino, “An empirical
533, 2016, doi: 10.2197/ipsjjip.24.522. comparison of botnet detection methods,” Computers & Security, vol.
[96] S. Garcia, A. Parmisano, and M. J. Erquiaga, “IoT-23: A labeled 45, pp. 100–123, Sep. 2014, doi: 10.1016/j.cose.2014.05.011.
dataset with malicious and benign IoT network traffic.” Zenodo, [115] S. Saad et al., “Detecting P2P botnets through network behavior
2020. doi: 10.5281/zenodo.4743746. analysis and machine learning,” in 2011 Ninth Annual International
[97] H. S. Anderson and P. Roth, “EMBER: An Open Dataset for Conference on Privacy, Security and Trust, Jul. 2011, pp. 174–180.
Training Static PE Malware Machine Learning Models.” arXiv, Apr. doi: 10.1109/PST.2011.5971980.
16, 2018. doi: 10.48550/arXiv.1804.04637. [116] N. Koroniotis, N. Moustafa, E. Sitnikova, and B. Turnbull,
[98] Y. Zhou and X. Jiang, “Dissecting Android Malware: “Towards the development of realistic botnet dataset in the Internet
Characterization and Evolution,” in 2012 IEEE Symposium on of Things for network forensic analytics: Bot-IoT dataset,” Future
Security and Privacy, 2012, pp. 95–109. doi: 10.1109/SP.2012.16. Generation Computer Systems, vol. 100, pp. 779–796, Nov. 2019,
[99] “VirusShare.com.” https://virusshare.com/ (accessed Jul. 13, 2022). doi: 10.1016/j.future.2019.05.041.
[100] A. H. Lashkari, A. F. A. Kadir, L. Taheri, and A. A. Ghorbani, [117] M. Zago, M. Gil Pérez, and G. Martínez Pérez, “UMUDGA: A
“Toward Developing a Systematic Approach to Generate dataset for profiling DGA-based botnet,” Computers & Security, vol.
Benchmark Android Malware Datasets and Classification,” in 2018 92, p. 101719, May 2020, doi: 10.1016/j.cose.2020.101719.
International Carnahan Conference on Security Technology [118]R. Vinayakumar, K. P. Soman, P. Poornachandran, M. Alazab, and S.
(ICCST), 2018, pp. 1–7. doi: 10.1109/CCST.2018.8585560. M. Thampi, “AmritaDGA: A comprehensive data set for domain
[101] “Drebin: Effective and Explainable Detection of Android Malware generation algorithms (DGAs) based domain name detection
in Your Pocket – NDSS Symposium.” https://www.ndss- systems and application of deep learning,” in Big Data
Recommender Systems, vol. 2, O. Khalid, S. U. Khan, and A. Y. [137] J. Yuan, Y. Zheng, X. Xie, and G. Sun, “Driving with knowledge
Zomaya, Eds. Stevenage: Institution of Engineering and Technology, from the physical world,” in Proceedings of the 17th ACM SIGKDD
2019, pp. 455–485. doi: 10.1049/PBPC035G_ch22. international conference on Knowledge discovery and data mining,
[119] H.-K. Shin, W. Lee, J.-H. Yun, and H. Kim, “{HAI} 1.0: {HIL- New York, NY, USA, Aug. 2011, pp. 316–324. doi:
based} Augmented {ICS} Security Dataset,” 2020. Accessed: Jul. 13, 10.1145/2020408.2020462.
2022. [Online]. Available: [138] Y. Zheng, L. Zhang, X. Xie, and W.-Y. Ma, “Mining interesting
https://www.usenix.org/conference/cset20/presentation/shin locations and travel sequences from GPS trajectories,” in
[120] R. C. Borges Hink, J. M. Beaver, M. A. Buckner, T. Morris, U. Proceedings of the 18th international conference on World wide web,
Adhikari, and S. Pan, “Machine learning for power system New York, NY, USA, Apr. 2009, pp. 791–800. doi:
disturbance and cyber-attack discrimination,” in 2014 7th 10.1145/1526709.1526816.
International Symposium on Resilient Control Systems (ISRCS), [139] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets
2014, pp. 1–8. doi: 10.1109/ISRCS.2014.6900095. robotics: The KITTI dataset,” The International Journal of Robotics
[121] M. S. Elsayed, N.-A. Le-Khac, and A. D. Jurcut, “InSDN: A Novel Research, vol. 32, no. 11, pp. 1231–1237, Sep. 2013, doi:
SDN Intrusion Dataset,” IEEE Access, vol. 8, pp. 165263–165284, 10.1177/0278364913491297.
2020, doi: 10.1109/ACCESS.2020.3022633. [140] D. P. Hughes and M. Salathe, “An open access repository of images
[122] K. Marek et al., “The Parkinson Progression Marker Initiative on plant health to enable the development of mobile disease
(PPMI),” Progress in Neurobiology, vol. 95, no. 4, pp. 629–635, Dec. diagnostics.” arXiv, Apr. 11, 2016. doi: 10.48550/arXiv.1511.08060.
2011, doi: 10.1016/j.pneurobio.2011.09.005. [141] “photometric stereo-based 3D imaging system using computer vision
[123] L. Cui and D. Lee, “CoAID: COVID-19 Healthcare Misinformation and deep learning for tracking plant growth | GigaScience | Oxford
Dataset.” arXiv, Nov. 03, 2020. doi: 10.48550/arXiv.2006.00885. Academic.”
[124] D. Dave, H. Naik, S. Singhal, and P. Patel, “Explainable AI meets https://academic.oup.com/gigascience/article/8/5/giz056/5498634?lo
Healthcare: A Study on Heart Disease Dataset.” arXiv, Nov. 06, gin=true (accessed Jul. 14, 2022).
2020. doi: 10.48550/arXiv.2011.03195. [142] R. Thapa, N. Snavely, S. Belongie, and A. Khan, “The Plant
[125] A. E. W. Johnson et al., “MIMIC-III, a freely accessible critical care Pathology 2020 challenge dataset to classify foliar disease of
database,” Sci Data, vol. 3, no. 1, Art. no. 1, May 2016, doi: apples.” arXiv, Apr. 24, 2020. doi: 10.48550/arXiv.2004.11958.
10.1038/sdata.2016.35. [143] E. Vural, J. Huang, D. Hou, and S. Schuckers, “Shared research
[126] M. Saeed et al., “Multiparameter Intelligent Monitoring in Intensive dataset to support development of keystroke authentication,” in
Care II: a public-access intensive care unit database,” Crit Care Med, IEEE International Joint Conference on Biometrics, Sep. 2014, pp.
vol. 39, no. 5, pp. 952–960, May 2011, doi: 1–8. doi: 10.1109/BTAS.2014.6996259.
10.1097/CCM.0b013e31820a92c6. [144] D. Gunetti and C. Picardi, “Keystroke analysis of free text,” TSEC,
[127] P. Wagner et al., “PTB-XL, a large publicly available 2005, doi: 10.1145/1085126.1085129.
electrocardiography dataset,” Sci Data, vol. 7, no. 1, Art. no. 1, May [145] Y. Sun, H. Ceker, and S. Upadhyaya, “Shared keystroke dataset for
2020, doi: 10.1038/s41597-020-0495-6. continuous authentication,” in 2016 IEEE International Workshop
[128] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, “Breast on Information Forensics and Security (WIFS), 2016, pp. 1–6. doi:
cancer histopathological image classification using Convolutional 10.1109/WIFS.2016.7823894.
Neural Networks,” in 2016 International Joint Conference on [146] S. Ng, “Opportunities and Challenges: Lessons from Analyzing
Neural Networks (IJCNN), Jul. 2016, pp. 2560–2567. doi: Terabytes of Scanner Data.” National Bureau of Economic Research,
10.1109/IJCNN.2016.7727519. Aug. 2017. doi: 10.3386/w23673.
[129] F. Liu et al., “An Open Access Database for Evaluating the [147] “UCI Machine Learning Repository: Statlog (German Credit Data)
Algorithms of Electrocardiogram Rhythm and Morphology Data Set.”
Abnormality Detection,” Journal of Medical Imaging and Health https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)
Informatics, vol. 8, no. 7, pp. 1368–1373, Sep. 2018, doi: (accessed Jul. 14, 2022).
10.1166/jmihi.2018.2442. [148] T. K. Lengyel, S. Maresca, B. D. Payne, G. D. Webster, S. Vogl, and
[130] Y. Gusev, K. Bhuvaneshwar, L. Song, J.-C. Zenklusen, H. Fine, and A. Kiayias, “Scalability, fidelity and stealth in the DRAKVUF
S. Madhavan, “The REMBRANDT study, a large collection of dynamic malware analysis system,” in Proceedings of the 30th
genomic data from brain cancer patients,” Sci Data, vol. 5, no. 1, Art. Annual Computer Security Applications Conference, New York, NY,
no. 1, Aug. 2018, doi: 10.1038/sdata.2018.158. USA, Dec. 2014, pp. 386–395. doi: 10.1145/2664243.2664252.
[131] R. L. Bowman, Q. Wang, A. Carro, R. G. W. Verhaak, and M. [149] Y. Ye, T. Li, D. Adjeroh, and S. S. Iyengar, “A Survey on Malware
Squatrito, “GlioVis data portal for visualization and analysis of brain Detection Using Data Mining Techniques,” ACM Comput. Surv., vol.
tumor expression datasets,” Neuro-Oncology, vol. 19, no. 1, pp. 50, no. 3, p. 41:1-41:40, Jun. 2017, doi: 10.1145/3073559.
139–141, Jan. 2017, doi: 10.1093/neuonc/now247. [150]R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, and S.
[132] S. Uppoor, O. Trullols-Cruces, M. Fiore, and J. M. Barcelo-Ordinas, Venkatraman, “Robust Intelligent Malware Detection Using Deep
“Generation and Analysis of a Large-Scale Urban Vehicular Learning,” IEEE Access, vol. 7, pp. 46717–46738, 2019, doi:
Mobility Dataset,” IEEE Transactions on Mobile Computing, vol. 13, 10.1109/ACCESS.2019.2906934.
no. 5, pp. 1061–1075, 2014, doi: 10.1109/TMC.2013.27. [151] A. Yan et al., “Effective detection of mobile malware behavior
[133]P. R. L. de Almeida, L. S. Oliveira, A. S. Britto, E. J. Silva, and A. L. based on explainable deep neural network,” Neurocomputing, vol.
Koerich, “PKLot – A robust dataset for parking lot classification,” 453, pp. 482–492, Sep. 2021, doi: 10.1016/j.neucom.2020.09.082.
Expert Systems with Applications, vol. 42, no. 11, pp. 4937–4949, [152] M. Melis, D. Maiorca, B. Biggio, G. Giacinto, and F. Roli,
Jul. 2015, doi: 10.1016/j.eswa.2015.02.009. “Explaining Black-box Android Malware Detection,” in 2018 26th
[134] “[PDF] Fast Global Alignment Kernels | Semantic Scholar.” European Signal Processing Conference (EUSIPCO), Sep. 2018, pp.
https://www.semanticscholar.org/paper/Fast-Global-Alignment- 524–528. doi: 10.23919/EUSIPCO.2018.8553598.
Kernels-Cuturi/7de1f5079ed7a8a8a5690f72ad2099f52d697393 [153] D. Arp, M. Spreitzenbarth, M. Hübner, H. Gascon, and K. Rieck,
(accessed Jul. 14, 2022). “Drebin: Effective and Explainable Detection of Android Malware
[135] G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Meghini, and C. in Your Pocket,” San Diego, CA, 2014. doi:
Vairo, “Deep learning for decentralized parking lot occupancy 10.14722/ndss.2014.23247.
detection,” Expert Systems with Applications, vol. 72, pp. 327–334, [154] S. Bose, T. Barao, and X. Liu, “Explaining AI for Malware
Apr. 2017, doi: 10.1016/j.eswa.2016.10.055. Detection: Analysis of Mechanisms of MalConv,” in 2020
[136] G. Oh, D. J. Leblanc, and H. Peng, “Vehicle Energy Dataset (VED), International Joint Conference on Neural Networks (IJCNN), Jul.
A Large-Scale Dataset for Vehicle Energy Consumption Research,” 2020, pp. 1–8. doi: 10.1109/IJCNN48605.2020.9207322.
IEEE Transactions on Intelligent Transportation Systems, vol. 23, [155] E. Raff, J. Barker, J. Sylvester, R. Brandon, B. Catanzaro, and C. K.
no. 4, pp. 3302–3312, Apr. 2022, doi: 10.1109/TITS.2020.3035596. Nicholas, “Malware Detection by Eating a Whole EXE,” Jun. 2018.
Accessed: Jul. 18, 2022. [Online]. Available:
https://www.aaai.org/ocs/index.php/WS/AAAIW18/paper/view/164 [175] S. S. C. Silva, R. M. P. Silva, R. C. G. Pinto, and R. M. Salles,
22 “Botnets: A survey,” Computer Networks, vol. 57, no. 2, pp. 378–
[156] H. S. Anderson and P. Roth, “EMBER: An Open Dataset for 403, Feb. 2013, doi: 10.1016/j.comnet.2012.07.021.
Training Static PE Malware Machine Learning Models.” arXiv, Apr. [176] “Botnet Detection Market Global Industry Historical Analysis, Size,
16, 2018. doi: 10.48550/arXiv.1804.04637. Growth, Trends, Emerging Factors, Demands, Key Players,
[157] H. Naeem, B. M. Alshammari, and F. Ullah, “Explainable Artificial Emerging Technologies and Potential of Industry till 2027 -
Intelligence-Based IoT Device Malware Detection Mechanism MarketWatch.” https://www.marketwatch.com/press-release/botnet-
Using Image Visualization and Fine-Tuned CNN-Based Transfer detection-market-global-industry-historical-analysis-size-growth-
Learning Model,” Computational Intelligence and Neuroscience, vol. trends-emerging-factors-demands-key-players-emerging-
2022, p. e7671967, Jul. 2022, doi: 10.1155/2022/7671967. technologies-and-potential-of-industry-till-2027-2022-06-29
[158] A. Yan et al., “Effective detection of mobile malware behavior (accessed Jul. 19, 2022).
based on explainable deep neural network,” Neurocomputing, vol. [177] O. Tsemogne, Y. Hayel, C. Kamhoua, and G. Deugoué, “Game-
453, pp. 482–492, Sep. 2021, doi: 10.1016/j.neucom.2020.09.082. Theoretic Modeling of Cyber Deception Against Epidemic Botnets
[159] G. Iadarola, F. Martinelli, F. Mercaldo, and A. Santone, “Towards an in Internet of Things,” IEEE Internet of Things Journal, vol. 9, no. 4,
interpretable deep learning model for mobile malware detection and pp. 2678–2687, 2022, doi: 10.1109/JIOT.2021.3081751.
family identification,” Computers & Security, vol. 105, p. 102198, [178] H. Suryotrisongko, Y. Musashi, A. Tsuneda, and K. Sugitani,
Jun. 2021, doi: 10.1016/j.cose.2021.102198. “Robust Botnet DGA Detection: Blending XAI and OSINT for
[160] S. Wang et al., “TrafficAV: An effective and explainable detection Cyber Threat Intelligence Sharing,” IEEE Access, vol. 10, pp.
of mobile malware behavior using network traffic,” in 2016 34613–34624, 2022, doi: 10.1109/ACCESS.2022.3162588.
IEEE/ACM 24th International Symposium on Quality of Service [179] S. Araki, K. Takahashi, B. Hu, K. Kamiya, and M. Tanikawa,
(IWQoS), Jun. 2016, pp. 1–6. doi: 10.1109/IWQoS.2016.7590446. “Subspace Clustering for Interpretable Botnet Traffic Analysis,” in
[161] M. M. Alani and A. I. Awad, “PAIRED: An Explainable ICC 2019 - 2019 IEEE International Conference on
Lightweight Android Malware Detection System,” IEEE Access, pp. Communications (ICC), 2019, pp. 1–6. doi:
1–1, 2022, doi: 10.1109/ACCESS.2022.3189645. 10.1109/ICC.2019.8761218.
[162] M. Kinkead, S. Millar, N. McLaughlin, and P. O’Kane, “Towards [180] “MAWI Working Group Traffic Archive.”
Explainable CNNs for Android Malware Detection,” Procedia http://mawi.wide.ad.jp/mawi/ (accessed Jul. 19, 2022).
Computer Science, vol. 184, pp. 959–965, Jan. 2021, doi: [181] M. Mazza, S. Cresci, M. Avvenuti, W. Quattrociocchi, and M.
10.1016/j.procs.2021.03.118. Tesconi, “RTbust: Exploiting Temporal Patterns for Botnet
[163] E. G. Dada, J. S. Bassi, H. Chiroma, S. M. Abdulhamid, A. O. Detection on Twitter,” in Proceedings of the 10th ACM Conference
Adetunmbi, and O. E. Ajibuwa, “Machine learning for email spam on Web Science, New York, NY, USA, Jun. 2019, pp. 183–192. doi:
filtering: review, approaches and open research problems,” Heliyon, 10.1145/3292522.3326015.
vol. 5, no. 6, p. e01802, Jun. 2019, doi: [182] H. Bahşi, S. Nõmm, and F. B. La Torre, “Dimensionality Reduction
10.1016/j.heliyon.2019.e01802. for Machine Learning Based IoT Botnet Detection,” in 2018 15th
[164] “Daily number of e-mails worldwide 2025,” Statista. International Conference on Control, Automation, Robotics and
https://www.statista.com/statistics/456500/daily-number-of-e-mails- Vision (ICARCV), 2018, pp. 1857–1862. doi:
worldwide/ (accessed Feb. 21, 2022). 10.1109/ICARCV.2018.8581205.
[165] A. Karim, S. Azam, B. Shanmugam, K. Kannoorpatti, and M. [183] P. P. Kundu, T. Truong-Huu, L. Chen, L. Zhou, and S. G. Teo,
Alazab, “A Comprehensive Survey for Intelligent Spam Email “Detection and Classification of Botnet Traffic using Deep Learning
Detection,” IEEE Access, vol. 7, pp. 168261–168295, 2019, doi: with Model Explanation,” IEEE Transactions on Dependable and
10.1109/ACCESS.2019.2954791. Secure Computing, pp. 1–15, 2022, doi:
[166] R. R. Hoffman, S. T. Mueller, G. Klein, and J. Litman, “Metrics for 10.1109/TDSC.2022.3183361.
Explainable AI: Challenges and Prospects.” arXiv, Feb. 01, 2019. [184] M. M. Alani, “BotStop : Packet-based efficient and explainable IoT
doi: 10.48550/arXiv.1812.04608. botnet detection using machine learning,” Computer
[167] M. Renftle, H. Trittenbach, M. Poznic, and R. Heil, “Explaining Any Communications, vol. 193, pp. 53–62, Sep. 2022, doi:
ML Model? -- On Goals and Capabilities of XAI,” Jun. 2022, doi: 10.1016/j.comcom.2022.06.039.
10.48550/arXiv.2206.13888. [185] D. Buil-Gil, F. Miró-Llinares, A. Moneva, S. Kemp, and N. Díaz-
[168] J. C. S. Reis, A. Correia, F. Murai, A. Veloso, and F. Benevenuto, Castaño, “Cybercrime and shifts in opportunities during COVID-19:
“Explainable Machine Learning for Fake News Detection,” in a preliminary analysis in the UK,” European Societies, vol. 23, no.
Proceedings of the 10th ACM Conference on Web Science, New sup1, pp. S47–S59, Feb. 2021, doi:
York, NY, USA, Jun. 2019, pp. 17–26. doi: 10.1080/14616696.2020.1804973.
10.1145/3292522.3326027. [186] J. Gee and P. M. Button, “The Financial Cost of Fraud 2019,” p. 28.
[169] P. Hacker, R. Krestel, S. Grundmann, and F. Naumann, “Explainable [187] I. Psychoula, A. Gutmann, P. Mainali, S. H. Lee, P. Dunphy, and F.
AI under contract and tort law: legal incentives and technical Petitcolas, “Explainable Machine Learning for Fraud Detection,”
challenges,” Artif Intell Law, vol. 28, no. 4, pp. 415–439, Dec. 2020, Computer, vol. 54, no. 10, pp. 49–59, 2021, doi:
doi: 10.1007/s10506-020-09260-6. 10.1109/MC.2021.3081249.
[170] T. Almeı̇ da, J. M. Hı̇ dalgo, and T. Sı̇ lva, “Towards SMS Spam [188] “IEEE-CIS Fraud Detection.” https://kaggle.com/competitions/ieee-
Filtering: Results under a New Dataset,” International Journal of fraud-detection (accessed Jul. 20, 2022).
Information Security Science, vol. 2, no. 1, Art. no. 1, Mar. 2013. [189] D. Farrugia, C. Zerafa, T. Cini, B. Kuasney, and K. Livori, “A Real-
[171] B. Mathew, P. Saha, S. M. Yimam, C. Biemann, P. Goyal, and A. Time Prescriptive Solution for Explainable Cyber-Fraud Detection
Mukherjee, “HateXplain: A Benchmark Dataset for Explainable Within the iGaming Industry,” SN COMPUT. SCI., vol. 2, no. 3, p.
Hate Speech Detection,” Proceedings of the AAAI Conference on 215, Apr. 2021, doi: 10.1007/s42979-021-00623-7.
Artificial Intelligence, vol. 35, no. 17, Art. no. 17, May 2021. [190] S. X. Rao et al., “xFraud: Explainable Fraud Transaction Detection,”
[172] Z. Zhang, D. Robinson, and J. Tepper, “Detecting Hate Speech on Proc. VLDB Endow., vol. 15, no. 3, pp. 427–436, Nov. 2021, doi:
Twitter Using a Convolution-GRU Based Deep Neural Network,” in 10.14778/3494124.3494128.
The Semantic Web, Cham, 2018, pp. 745–760. doi: 10.1007/978-3- [191] Z. Hu, Y. Dong, K. Wang, and Y. Sun, “Heterogeneous Graph
319-93417-4_48. Transformer,” in Proceedings of The Web Conference 2020, New
[173] M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural York, NY, USA: Association for Computing Machinery, 2020, pp.
networks,” IEEE Transactions on Signal Processing, vol. 45, no. 11, 2704–2710. Accessed: Jul. 20, 2022. [Online]. Available:
pp. 2673–2681, 1997, doi: 10.1109/78.650093. https://doi.org/10.1145/3366423.3380027
[174] B. Liu and I. Lane, “Attention-Based Recurrent Neural Network [192] Z. Liu, C. Chen, X. Yang, J. Zhou, X. Li, and L. Song,
Models for Joint Intent Detection and Slot Filling.” arXiv, Sep. 06, “Heterogeneous Graph Neural Networks for Malicious Account
2016. doi: 10.48550/arXiv.1609.01454. Detection,” in Proceedings of the 27th ACM International
Conference on Information and Knowledge Management, New York, [209] K. Lee, B. Eoff, and J. Caverlee, “Seven Months with the Devils: A
NY, USA, Oct. 2018, pp. 2077–2085. doi: Long-Term Study of Content Polluters on Twitter,” Proceedings of
10.1145/3269206.3272010. the International AAAI Conference on Web and Social Media, vol. 5,
[193] K. Roshan and A. Zafar, “Utilizing XAI technique to improve no. 1, Art. no. 1, 2011.
autoencoder based model for computer network anomaly detection [210] H. Liu, C. Zhong, A. Alnusair, and S. R. Islam, “FAIXID: A
with shapley additive explanation(SHAP).” Dec. 14, 2021. doi: Framework for Enhancing AI Explainability of Intrusion Detection
10.5121/ijcnc.2021.13607. Results Using Data Cleaning Techniques,” J Netw Syst Manage, vol.
[194] Y. Zhu et al., “Modeling Users’ Behavior Sequences with 29, no. 4, p. 40, May 2021, doi: 10.1007/s10922-021-09606-8.
Hierarchical Explainable Network for Cross-domain Fraud [211] S. Mane and D. Rao, “Explaining Network Intrusion Detection
Detection,” in Proceedings of The Web Conference 2020, New York, System Using Explainable AI Framework.” arXiv, Mar. 12, 2021.
NY, USA: Association for Computing Machinery, 2020, pp. 928– doi: 10.48550/arXiv.2103.07110.
938. Accessed: Jul. 20, 2022. [Online]. Available: [212] B. Mahbooba, M. Timilsina, R. Sahal, and M. Serrano, “Explainable
https://doi.org/10.1145/3366423.3380172 Artificial Intelligence (XAI) to Enhance Trust Management in
[195] K. Yang and W. Xu, FraudMemory: Explainable Memory-Enhanced Intrusion Detection Systems Using Decision Tree Model,”
Sequential Neural Networks for Financial Fraud Detection. 2019. Complexity, vol. 2021, p. e6634811, Jan. 2021, doi:
Accessed: Jul. 20, 2022. [Online]. Available: 10.1155/2021/6634811.
http://hdl.handle.net/10125/59542 [213] S. Wali and I. Khan, “Explainable AI and Random Forest Based
[196] Z. Xiao and J. Jiao, “Explainable Fraud Detection for Few Labeled Reliable Intrusion Detection system.” TechRxiv, Dec. 18, 2021. doi:
Time Series Data,” Security and Communication Networks, vol. 10.36227/techrxiv.17169080.v1.
2021, p. e9941464, Jun. 2021, doi: 10.1155/2021/9941464. [214] M. Ghurab, G. Gaphari, F. Alshami, R. Alshamy, and S. Othman, “A
[197] W. Min, W. Liang, H. Yin, Z. Wang, M. Li, and A. Lal, Detailed Analysis of Benchmark Datasets for Network Intrusion
“Explainable Deep Behavioral Sequence Clustering for Transaction Detection System.” Rochester, NY, Apr. 14, 2021. Accessed: Jul. 22,
Fraud Detection.” arXiv, Jan. 11, 2021. doi: 2022. [Online]. Available: https://papers.ssrn.com/abstract=3834787
10.48550/arXiv.2101.04285. [215] T. Zebin, S. Rezvy, and Y. Luo, “An Explainable AI-Based Intrusion
[198] A. Das, S. Baki, A. El Aassal, R. Verma, and A. Dunbar, “SoK: A Detection System for DNS Over HTTPS (DoH) Attacks,” IEEE
Comprehensive Reexamination of Phishing Research From the Transactions on Information Forensics and Security, vol. 17, pp.
Security Perspective,” IEEE Communications Surveys & Tutorials, 2339–2349, 2022, doi: 10.1109/TIFS.2022.3183390.
vol. 22, no. 1, pp. 671–708, 2020, doi: [216] M. MontazeriShatoori, L. Davidson, G. Kaur, and A. Habibi
10.1109/COMST.2019.2957750. Lashkari, “Detection of DoH Tunnels using Time-series
[199] Y. Chai, Y. Zhou, W. Li, and Y. Jiang, “An Explainable Multi- Classification of Encrypted Traffic,” in 2020 IEEE Intl Conf on
Modal Hierarchical Attention Model for Developing Phishing Threat Dependable, Autonomic and Secure Computing, Intl Conf on
[199] … Intelligence,” IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 2, pp. 790–803, Mar. 2022, doi: 10.1109/TDSC.2021.3119323.
[200] P. R. Galego Hernandes, C. P. Floret, K. F. Cardozo De Almeida, V. C. Da Silva, J. P. Papa, and K. A. Pontara Da Costa, “Phishing Detection Using URL-based XAI Techniques,” in 2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021, pp. 1–6. doi: 10.1109/SSCI50451.2021.9659981.
[201] O. K. Sahingoz, E. Buber, O. Demir, and B. Diri, “Machine learning based phishing detection from URLs,” Expert Systems with Applications, vol. 117, pp. 345–357, Mar. 2019, doi: 10.1016/j.eswa.2018.09.029.
[202] Y. Lin et al., “Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages,” 2021, pp. 3793–3810. Accessed: Jul. 21, 2022. [Online]. Available: https://www.usenix.org/conference/usenixsecurity21/presentation/lin
[203] R. Valecha, P. Mandaokar, and H. R. Rao, “Phishing Email Detection Using Persuasion Cues,” IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 2, pp. 747–756, Mar. 2022, doi: 10.1109/TDSC.2021.3118931.
[204] P. Barnard, N. Marchetti, and L. A. D. Silva, “Robust Network Intrusion Detection through Explainable Artificial Intelligence (XAI),” IEEE Networking Letters, pp. 1–1, 2022, doi: 10.1109/LNET.2022.3186589.
[205] G. Andresini, A. Appice, F. P. Caforio, D. Malerba, and G. Vessio, “ROULETTE: A neural attention multi-output model for explainable Network Intrusion Detection,” Expert Systems with Applications, vol. 201, p. 117144, Sep. 2022, doi: 10.1016/j.eswa.2022.117144.
[206] Z. A. E. Houda, B. Brik, and L. Khoukhi, “‘Why Should I Trust Your IDS?’: An Explainable Deep Learning Framework for Intrusion Detection Systems in Internet of Things Networks,” IEEE Open Journal of the Communications Society, pp. 1–1, 2022, doi: 10.1109/OJCOMS.2022.3188750.
[207] “Network Intrusion Detection Based on Explainable Artificial Intelligence,” Jun. 16, 2022. https://www.researchsquare.com (accessed Jul. 21, 2022).
[208] “KHO-XAI: Krill herd optimization and Explainable Artificial Intelligence framework for Network Intrusion Detection Systems in Industry 4.0,” Jun. 10, 2022. https://www.researchsquare.com (accessed Jul. 21, 2022).
[216] … Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), 2020, pp. 63–70. doi: 10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00026.
[217] Y. Li, K. Xiong, T. Chin, and C. Hu, “A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection,” IEEE Access, vol. 7, pp. 32765–32782, 2019, doi: 10.1109/ACCESS.2019.2891588.
[218] F. Becker, A. Drichel, C. Müller, and T. Ertl, “Interpretable Visualizations of Deep Neural Networks for Domain Generation Algorithm Detection,” in 2020 IEEE Symposium on Visualization for Cyber Security (VizSec), 2020, pp. 25–29. doi: 10.1109/VizSec51108.2020.00010.
[219] A. Drichel, N. Faerber, and U. Meyer, “First Step Towards EXPLAINable DGA Multiclass Classification,” in The 16th International Conference on Availability, Reliability and Security, New York, NY, USA, Aug. 2021, pp. 1–13. doi: 10.1145/3465481.3465749.
[220] D. Plohmann, K. Yakdan, M. Klatt, J. Bader, and E. Gerhards-Padilla, “A Comprehensive Measurement Study of Domain Generating Malware,” 2016, pp. 263–278. Accessed: Jul. 23, 2022. [Online]. Available: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/plohmann
[221] “Home - eduroam.org,” eduroam.org - eduroam global site. https://eduroam.org/ (accessed Jul. 23, 2022).
[222] A. Drichel, U. Meyer, S. Schüppen, and D. Teubert, “Analyzing the real-world applicability of DGA classifiers,” in Proceedings of the 15th International Conference on Availability, Reliability and Security, New York, NY, USA, Aug. 2020, pp. 1–11. doi: 10.1145/3407023.3407030.
[223] R. H. Jhaveri, S. J. Patel, and D. C. Jinwala, “DoS Attacks in Mobile Ad Hoc Networks: A Survey,” in 2012 Second International Conference on Advanced Computing & Communication Technologies, 2012, pp. 535–541. doi: 10.1109/ACCT.2012.48.
[224] S. Aziz et al., “Anomaly Detection in the Internet of Vehicular Networks Using Explainable Neural Networks (xNN),” Mathematics, vol. 10, no. 8, Art. no. 8, Jan. 2022, doi: 10.3390/math10081267.
[225] B. Hsupeng, K.-W. Lee, T.-E. Wei, and S.-H. Wang, “Explainable Malware Detection Using Predefined Network Flow,” in 2022 24th International Conference on Advanced Communication Technology (ICACT), 2022, pp. 27–33. doi: 10.23919/ICACT53585.2022.9728897.
[226] R. R. Prasad, R. R. Rejimol Robinson, C. Thomas, and N. Balakrishnan, “Evaluation of Strategic Decision taken by Autonomous Agent using Explainable AI,” in 2021 4th International Conference on Security and Privacy (ISEA-ISAP), 2021, pp. 1–8. doi: 10.1109/ISEA-ISAP54304.2021.9689715.
[227] “KDD Cup 1999 Data.” http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html (accessed Jul. 23, 2022).
[228] K. Amarasinghe, K. Kenney, and M. Manic, “Toward Explainable Deep Neural Network Based Anomaly Detection,” in 2018 11th International Conference on Human System Interaction (HSI), Jul. 2018, pp. 311–317. doi: 10.1109/HSI.2018.8430788.
[229] M. Javaid and A. Haleem, “Industry 4.0 applications in medical field: A brief review,” Current Medicine Research and Practice, vol. 9, no. 3, pp. 102–109, May 2019, doi: 10.1016/j.cmrp.2019.04.001.
[230] L. Coventry and D. Branley, “Cybersecurity in healthcare: A narrative review of trends, threats and ways forward,” Maturitas, vol. 113, pp. 48–52, Jul. 2018, doi: 10.1016/j.maturitas.2018.04.008.
[231] D. Dave, H. Naik, S. Singhal, and P. Patel, “Explainable AI meets Healthcare: A Study on Heart Disease Dataset.” arXiv, Nov. 06, 2020. doi: 10.48550/arXiv.2011.03195.
[232] X. Li et al., “BrainGNN: Interpretable Brain Graph Neural Network for fMRI Analysis,” Medical Image Analysis, vol. 74, p. 102233, Dec. 2021, doi: 10.1016/j.media.2021.102233.
[233] “Ada-WHIPS: explaining AdaBoost classification with applications in the health sciences | BMC Medical Informatics and Decision Making | Full Text.” https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-020-01201-2 (accessed Jul. 25, 2022).
[234] D. R. Chittajallu et al., “XAI-CBIR: Explainable AI System for Content based Retrieval of Video Frames from Minimally Invasive Surgery Videos,” in 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Apr. 2019, pp. 66–69. doi: 10.1109/ISBI.2019.8759428.
[235] K. Zhang, J. Ni, K. Yang, X. Liang, J. Ren, and X. S. Shen, “Security and Privacy in Smart City Applications: Challenges and Solutions,” IEEE Communications Magazine, vol. 55, no. 1, pp. 122–129, Jan. 2017, doi: 10.1109/MCOM.2017.1600267CM.
[236] M. Zolanvari, Z. Yang, K. Khan, R. Jain, and N. Meskin, “TRUST XAI: Model-Agnostic Explanations for AI With a Case Study on IIoT Security,” IEEE Internet of Things Journal, pp. 1–1, 2021, doi: 10.1109/JIOT.2021.3122019.
[237] R. Stirnberg et al., “Meteorology-driven variability of air pollution (PM1) revealed with explainable machine learning,” Atmospheric Chemistry and Physics, vol. 21, no. 5, pp. 3919–3948, Mar. 2021, doi: 10.5194/acp-21-3919-2021.
[238] M. Haeffelin et al., “SIRTA, a ground-based atmospheric observatory for cloud and aerosol research,” Annales Geophysicae, vol. 23, no. 2, pp. 253–275, Feb. 2005, doi: 10.5194/angeo-23-253-2005.
[239] L. Monje, R. A. Carrasco, C. Rosado, and M. Sánchez-Montañés, “Deep Learning XAI for Bus Passenger Forecasting: A Use Case in Spain,” Mathematics, vol. 10, no. 9, Art. no. 9, Jan. 2022, doi: 10.3390/math10091428.
[240] G. Kostopoulos, T. Panagiotakopoulos, S. Kotsiantis, C. Pierrakeas, and A. Kameas, “Interpretable Models for Early Prediction of Certification in MOOCs: A Case Study on a MOOC for Smart City Professionals,” IEEE Access, vol. 9, pp. 165881–165891, 2021, doi: 10.1109/ACCESS.2021.3134787.
[241] Y. Feng, D. Wang, Y. Yin, Z. Li, and Z. Hu, “An XGBoost-based casualty prediction method for terrorist attacks,” Complex Intell. Syst., vol. 6, no. 3, pp. 721–740, Oct. 2020, doi: 10.1007/s40747-020-00173-0.
[242] M. C. Garrido, J. M. Cadenas, A. Bueno-Crespo, R. Martínez-España, J. G. Giménez, and J. M. Cecilia, “Evaporation Forecasting through Interpretable Data Analysis Techniques,” Electronics, vol. 11, no. 4, Art. no. 4, Jan. 2022, doi: 10.3390/electronics11040536.
[243] C. M. Viana, M. Santos, D. Freire, P. Abrantes, and J. Rocha, “Evaluation of the factors explaining the use of agricultural land: A machine learning and model-agnostic approach,” Ecological Indicators, vol. 131, p. 108200, Nov. 2021, doi: 10.1016/j.ecolind.2021.108200.
[244] J. Daníelsson, R. Macrae, and A. Uthemann, “Artificial intelligence and systemic risk,” Journal of Banking & Finance, vol. 140, p. 106290, Jul. 2022, doi: 10.1016/j.jbankfin.2021.106290.
[245] D. V. Kute, B. Pradhan, N. Shukla, and A. Alamri, “Deep Learning and Explainable Artificial Intelligence Techniques Applied for Detecting Money Laundering–A Critical Review,” IEEE Access, vol. 9, pp. 82300–82317, 2021, doi: 10.1109/ACCESS.2021.3086230.
[246] S. Sachan, J.-B. Yang, D.-L. Xu, D. E. Benavides, and Y. Li, “An explainable AI decision-support-system to automate loan underwriting,” Expert Systems with Applications, vol. 144, p. 113100, Apr. 2020, doi: 10.1016/j.eswa.2019.113100.
[247] L. Yang, E. M. Kenny, T. L. J. Ng, Y. Yang, B. Smyth, and R. Dong, “Generating Plausible Counterfactual Explanations for Deep Transformers in Financial Text Classification.” arXiv, Oct. 23, 2020. doi: 10.48550/arXiv.2010.12512.
[248] A. Hanif, “Towards Explainable Artificial Intelligence in Banking and Financial Services.” arXiv, Dec. 14, 2021. doi: 10.48550/arXiv.2112.08441.
[249] F. Gurcan, N. E. Cagiltay, and K. Cagiltay, “Mapping Human–Computer Interaction Research Themes and Trends from Its Existence to Today: A Topic Modeling-Based Review of past 60 Years,” International Journal of Human–Computer Interaction, vol. 37, no. 3, pp. 267–280, Feb. 2021, doi: 10.1080/10447318.2020.1819668.
[250] “Toward human-centered AI: a perspective from human-computer interaction: Interactions: Vol 26, No 4.” https://dl.acm.org/doi/fullHtml/10.1145/3328485 (accessed Jul. 26, 2022).
[251] G. Loveleen, B. Mohan, B. S. Shikhar, J. Nz, M. Shorfuzzaman, and M. Masud, “Explanation-driven HCI Model to Examine the Mini-Mental State for Alzheimer’s Disease,” ACM Trans. Multimedia Comput. Commun. Appl., Mar. 2022, doi: 10.1145/3527174.
[252] V. Dominguez, I. Donoso-Guzmán, P. Messina, and D. Parra, “Algorithmic and HCI Aspects for Explaining Recommendations of Artistic Images,” ACM Trans. Interact. Intell. Syst., vol. 10, no. 4, pp. 30:1–30:31, Nov. 2020, doi: 10.1145/3369396.
[253] Q. V. Liao and K. R. Varshney, “Human-Centered Explainable AI (XAI): From Algorithms to User Experiences.” arXiv, Apr. 19, 2022. Accessed: Jul. 26, 2022. [Online]. Available: http://arxiv.org/abs/2110.10790
[254] K. B. Kelarestaghi, K. Heaslip, V. Fessmann, M. Khalilikhah, and A. Fuentes, “Intelligent Transportation System Security: Hacked Message Signs,” SAE International Journal of Transportation Cybersecurity and Privacy, vol. 1, Jun. 2018, doi: 10.4271/11-01-02-0004.
[255] N. Soni, R. Malekian, and A. Thakur, “Edge Computing in Transportation: Security Issues and Challenges.” arXiv, Dec. 21, 2020. doi: 10.48550/arXiv.2012.11206.
[256] H. Mankodiya, M. S. Obaidat, R. Gupta, and S. Tanwar, “XAI-AV: Explainable Artificial Intelligence for Trust Management in Autonomous Vehicles,” in 2021 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI), 2021, pp. 1–5. doi: 10.1109/CCCI52664.2021.9583190.
[257] “VeReMi dataset,” VeReMi-dataset.github.io. https://veremi-dataset.github.io/ (accessed Jul. 26, 2022).
[258] S. Shams Amiri, S. Mottahedi, E. R. Lee, and S. Hoque, “Peeking inside the black-box: Explainable machine learning applied to household transportation energy consumption,” Computers, Environment and Urban Systems, vol. 88, p. 101647, Jul. 2021, doi: 10.1016/j.compenvurbsys.2021.101647.
[259] C. Bustos et al., “Explainable, automated urban interventions to improve pedestrian and vehicle safety,” Transportation Research Part C: Emerging Technologies, vol. 125, p. 103018, Apr. 2021, doi: 10.1016/j.trc.2021.103018.
[260] A. Kuppa and N.-A. Le-Khac, “Black Box Attacks on Explainable Artificial Intelligence (XAI) methods in Cyber Security,” in 2020 International Joint Conference on Neural Networks (IJCNN), Jul. 2020, pp. 1–8. doi: 10.1109/IJCNN48605.2020.9206780.
[261] T.-T.-H. Le, H. Kang, and H. Kim, “Robust Adversarial Attack Against Explainable Deep Classification Models Based on Adversarial Images With Different Patch Sizes and Perturbation Ratios,” IEEE Access, vol. 9, pp. 133049–133061, 2021, doi: 10.1109/ACCESS.2021.3115764.
[262] H. Ali, M. S. Khan, A. Al-Fuqaha, and J. Qadir, “Tamp-X: Attacking explainable natural language classifiers through tampered activations,” Computers & Security, vol. 120, p. 102791, Sep. 2022, doi: 10.1016/j.cose.2022.102791.
[263] D. Slack, S. Hilgard, E. Jia, S. Singh, and H. Lakkaraju, “Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods,” in Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA: Association for Computing Machinery, 2020, pp. 180–186. doi: 10.1145/3375627.3375830.
[264] J. Angwin, J. Larson, S. Mattu, and L. Kirchner, “How We Analyzed the COMPAS Recidivism Algorithm,” ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm?token=BqO_ITYNAKmQwhj7daSusnn7aJDGaTWE (accessed Jul. 27, 2022).
[265] J. Wang, J. Tuyls, E. Wallace, and S. Singh, “Gradient-based Analysis of NLP Models is Manipulable.” arXiv, Oct. 11, 2020. doi: 10.48550/arXiv.2010.05419.
[266] G. K. Dziugaite, Z. Ghahramani, and D. M. Roy, “A study of the effect of JPG compression on adversarial images.” arXiv, Aug. 02, 2016. doi: 10.48550/arXiv.1608.00853.
[267] J. Gao, B. Wang, Z. Lin, W. Xu, and Y. Qi, “DeepCloak: Masking Deep Neural Network Models for Robustness Against Adversarial Samples.” arXiv, Apr. 17, 2017. doi: 10.48550/arXiv.1702.06763.
[268] P. Samangouei, M. Kabkab, and R. Chellappa, “Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models.” arXiv, May 17, 2018. doi: 10.48550/arXiv.1805.06605.
[269] D. Becking, M. Dreyer, W. Samek, K. Müller, and S. Lapuschkin, “ECQx: Explainability-Driven Quantization for Low-Bit and Sparse DNNs,” in xxAI - Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, July 18, 2020, Vienna, Austria, Revised and Extended Papers, A. Holzinger, R. Goebel, R. Fong, T. Moon, K.-R. Müller, and W. Samek, Eds. Cham: Springer International Publishing, 2022, pp. 271–296. doi: 10.1007/978-3-031-04083-2_14.
[270] W. Ha, C. Singh, F. Lanusse, S. Upadhyayula, and B. Yu, “Adaptive wavelet distillation from neural networks through interpretations,” in Advances in Neural Information Processing Systems, 2021, vol. 34, pp. 20669–20682. Accessed: Jul. 29, 2022. [Online]. Available: https://proceedings.neurips.cc/paper/2021/hash/acaa23f71f963e96c8847585e71352d6-Abstract.html

HUSSAM AL HAMADI (Senior Member, IEEE) studied computer engineering at Ajman University, where he graduated in 2005. He spent the period between 2005 and 2010 working as a computer consultant and tutor in several governmental and private institutions, before joining Khalifa University as a teaching assistant in 2010. He holds several international certificates in networking, business, and tutoring, such as MCSA, MCSE, CCNA, CBP, and CTP. In 2017, he received his Ph.D. degree in computer engineering from Khalifa University, where he is currently a research scientist in the Center for Cyber-Physical Systems (C2PS). His research interests focus on applied security protocols for systems such as software agents, SCADA, e-health systems, and autonomous vehicles.

ERNESTO DAMIANI (Senior Member, IEEE) is currently a Full Professor with the Università degli Studi di Milano, Italy, the Senior Director of the Robotics and Intelligent Systems Institute, and the Director of the Center for Cyber Physical Systems (C2PS), Khalifa University, United Arab Emirates. He is also the Leader of the Big Data Area, Etisalat British Telecom Innovation Center (EBTIC), and the President of the Consortium of Italian Computer Science Universities (CINI). He is also part of the ENISA Ad-Hoc Working Group on Artificial Intelligence Cybersecurity. He has pioneered model-driven data analytics. He has authored more than 650 Scopus-indexed publications and several patents. His research interests include cyber-physical systems, big data analytics, edge/cloud security and performance, artificial intelligence, and machine learning. He was a recipient of the Research and Innovation Award from the IEEE Technical Committee on Homeland Security, the Stephen Yau Award from the Service Society, the Outstanding Contributions Award from IFIP TC2, the Chester-Sall Award from IEEE IES, and a Doctorate Honoris Causa from INSA-Lyon, France, for his contribution to big data teaching and research.
ZHIBO ZHANG received the Bachelor of Science degree in mechatronics engineering from Northwestern Polytechnical University, China, in 2021. He is currently pursuing a master’s degree in electrical and computer engineering at Khalifa University, United Arab Emirates. His research interests focus on computer vision, cyber security, Explainable Artificial Intelligence, and Trustworthy Artificial Intelligence.

CHAN YEOB YEUN (Senior Member, IEEE) received the M.Sc. and Ph.D. degrees in information security from Royal Holloway, University of London, in 1996 and 2000, respectively. After his Ph.D. degree, he joined Toshiba TRL, Bristol, U.K., and later became the Vice President at the Mobile Handset Research and Development Center, LG Electronics, Seoul, South Korea, in 2005, where he was responsible for developing mobile TV technologies and related security. He left LG Electronics in 2007 and joined ICU (merged with KAIST), South Korea, until August 2008, and then the Khalifa University of Science and Technology, in September 2008. He is currently a Researcher in cybersecurity, including IoT/USN security, cyber-physical system security, cloud/fog security, and cryptographic techniques, an Associate Professor with the Department of Electrical Engineering and Computer Science, and the Cybersecurity Leader of the Center for Cyber-Physical Systems (C2PS). He also enjoys lecturing for M.Sc. cyber security and Ph.D. engineering courses at Khalifa University. He has published more than 140 journal articles and conference papers, nine book chapters, and ten international patent applications. He also serves on the editorial boards of multiple international journals and on the steering committees of international conferences.
FATMA TAHER (Senior Member, IEEE) received the Ph.D. degree from the Khalifa University of Science, Technology and Research, United Arab Emirates, in 2014. She is currently the Assistant Dean of the College of Technological Innovation, Zayed University, Dubai, United Arab Emirates. She has published more than 40 articles in international journals and conferences. Her research interests are in the areas of signal and image processing, pattern recognition, deep learning, machine learning, artificial intelligence, and medical image analysis, especially the detection of cancerous cells, kidney transplants, and autism. Her research also covers watermarking, remote sensing, and satellite images. She has served as a member of the steering, organizing, and technical program committees of many international conferences. She has received many distinguished awards, such as the first-prize Best Paper Award in the Ph.D. Forum of the 20th IEEE International Conference on Electronics, Circuits, and Systems (ICECS), December 2013. In November 2015, she received the UAE Pioneers Award, presented by H. H. Sheikh Mohammed Bin Rashed Al Maktoum, as the first Emirati to create a computer-aided diagnosis system for early lung cancer detection based on sputum color image analysis. She also received the Innovation Award at the 2016 Emirati Women Awards from H. H. Sheikh Ahmed Bin Saeed Al Maktoum, the Chairman of the Civil Aviation Authority and a Patron of the Dubai Quality Group, and the L’Oréal-UNESCO For Women in Science Middle East Fellowship in 2017. She is the Vice Chair of the IEEE UAE Section and the Chair of the Education Committee of the British Society, United Arab Emirates. She has served on many editorial and reviewing boards of international journals and conferences.
