Machine Learning Attacks
Abstract— Machine learning models have made many decision support systems faster, more accurate and more efficient. However, applications of machine learning in network security face a disproportionate threat of active adversarial attacks compared to other domains. This is because machine learning applications in network security, such as malware detection, intrusion detection, and spam filtering, are by themselves adversarial in nature. In what could be considered an arms race between attackers and defenders, adversaries constantly probe machine learning systems with inputs which are explicitly designed to bypass the system and induce a wrong prediction. In this survey, we first provide a taxonomy of machine learning techniques, tasks, and depth. We then introduce a classification of machine learning in network security applications. Next, we examine various adversarial attacks against machine learning in network security and introduce two classification approaches for adversarial attacks in network security. First, we classify adversarial attacks in network security based on a taxonomy of network security applications. Secondly, we categorize adversarial attacks in network security into a problem space vs. feature space dimensional classification model. We then analyze the various defenses against adversarial attacks on machine learning-based network security applications. We conclude by introducing an adversarial risk grid map and evaluating several existing adversarial attacks against machine learning in network security using the risk grid map. We also identify where each attack classification resides within the adversarial risk grid map.

Keywords: Machine Learning, Adversarial samples, Network security

I. INTRODUCTION

There has been an ever-increasing application of machine learning and deep learning techniques in network security. One key advantage of machine learning is that it makes optimal decisions more feasible. It, however, introduces a new challenge, since the security and robustness of these models is usually not a major consideration for machine learning algorithm designers, who are more focused on designing effective and efficient models. This creates room for various forms of attack against machine learning-based network security applications.

Researchers [1][2][3][4] have shown that the presence of adversarial samples can easily fool machine learning systems. Adversarial samples are specially crafted inputs that cause a machine learning model to classify an input wrongly. Machine learning systems typically take in input data in two distinct phases: the training data, which is fed into the learning algorithm during the training phase, and the new or test data, which is fed into the learned model during the prediction phase. If the attacker can manipulate the input data in either phase, it is possible to induce a wrong prediction from the machine learning model.

In this survey, we provide a brief introduction to machine learning using a three-dimensional classification method. We classify the various machine learning approaches based on the learning tasks, learning techniques and learning depth. We further organize the various applications of machine learning in network security based on a taxonomy of security tasks. Contrary to the survey by Corona et al. [5], our work focuses on adversarial attacks that are strictly machine learning based. Next, we classify the various adversarial attacks based on the applications in network security. We identify five main categories of machine learning applications in network security for our classification method. Finally, we classify adversarial attacks against machine learning based on a taxonomy of network security applications.

Our contribution is threefold. First, we introduce a new method for classifying adversarial attacks in network security based on a taxonomy of network security applications. We also introduce the concept of problem space and feature space dimensional classification of adversarial attacks in network security.

Secondly, we introduce the concept of adversarial risk in computer and network security. We provide a new risk mapping for evaluating the risk of adversarial attacks in network security based on the discriminative or directive autonomy of the machine learning tasks and techniques, respectively.

Lastly, we evaluate several adversarial attacks against machine learning in network security applications as proposed by various researchers, and classify the attacks based on the adversarial threat attack taxonomy shown in Table I.

As we outline in Section II, prior adversarial attack surveys [6][7][8] mainly covered the computer vision domain. Some surveys have tackled adversarial attacks in cybersecurity [9][10][11][12], but to the best of our knowledge, there is currently no prior work that has reviewed adversarial attacks in network security based on
a classification of network security applications. No prior work has reviewed the concept of problem space vs. feature space dimensional classification of adversarial attacks in network security either. Also, this is the first work to propose an adversarial machine learning risk grid map in the field of network security based on the directive or discriminative autonomy of the machine learning algorithms.

Fig. 1. Structure of the Paper

As illustrated in Figure 1, we structure the remainder of the paper as follows. In Section II, we survey related work. In Section III, we discuss some applications of machine learning in network security. In Section IV, we begin with a brief background on adversarial machine learning, followed by a description of our adversarial attack taxonomy. We also review different adversarial attack methods and algorithms. In Section V, we introduce a classification method for adversarial attacks in network security based on the network security CIA goals of confidentiality, integrity and availability. In Section VI, we discuss and evaluate adversarial risk in machine learning. In Section VII, we review various approaches for defending against adversarial attacks. In Section VIII, we provide some discussion and lessons learnt. Finally, in Section IX, we conclude the survey with guidance for future work.

II. RELATED WORK

Adversarial attacks have been widely studied in the field of computer vision [6][7][8], with several attack methods and techniques developed mostly for image recognition tasks. Researchers have discussed the public safety concerns of adversarial attacks, such as self-driving cars which could be fooled into misclassifying a stop sign, resulting in a potentially fatal outcome [13]. In network security, the consequences of adversarial attacks are equally significant [14], especially in areas such as intrusion detection [15] and malware detection [16], where there has been rapid progress in the adoption of machine learning for such tasks.

Even though adversarial machine learning has recently been widely researched in network security, to the best of our knowledge, there is currently no publication that has surveyed the vast and growing body of research on adversarial machine learning in this field. Some existing survey papers we reviewed include Akhtar et al. [17], which reviewed adversarial attacks against deep learning in computer vision. Qiu et al. [18] provided a generalized survey on adversarial attacks in artificial intelligence, with a brief discussion on cloud security, malware detection and intrusion detection. Liu et al. [19] reviewed security threats and corresponding defensive techniques of machine learning, focusing on the threats in the learning algorithms. Rosenberg et al. [9] provided a general review of adversarial attacks on cyber security domains such as intrusion detection systems, URL detection systems, biometric systems, cyber-physical systems (CPSs), and industrial control systems. Unlike their work, our review concentrates only on network security and uses different approaches to classify adversarial attacks and defenses. Duddu et al. [10] discussed various research works on adversarial machine learning in cyberwarfare, with some mention of adversarial attacks against malware classifiers. Zhang et al. [11] discussed adversarial attacks as a limitation of deep learning in mobile and wireless networking but did not consider deep learning in the context of network security applications. Buczak et al. [20], in their survey on machine learning-based cybersecurity intrusion detection, focused on the complexity and challenges of machine learning in cybersecurity but did not review adversarial attacks in their study. Biggio and Roli [12] provided a historical timeline of adversarial machine learning in the context of computer vision and cybersecurity, but their work did not provide a detailed review in the context of network security. Gardiner et al. [21], in their survey on the security of machine learning in malware detection, focused on reviewing Command and Control (C&C) detection techniques. They also identified the weaknesses and explained the limitations of secure machine learning algorithms in malware detection systems. Domain-specific surveys on adversarial machine learning have also been published, including Hao et al. [22], in which various adversarial attacks and defenses in images, graphs and texts were reviewed. In the field of natural language processing, Zhang et al. [23] reviewed various publications in which deep adversarial attacks and defenses were proposed. Sun et al. [24] published a survey on adversarial machine learning in graph data. In summary, Akhtar et al. [17] focused on computer vision and Duddu et al. [10] on cyber warfare.

Research Gap. With growing interest in the use of machine learning for network security applications, adversarial attacks against such machine learning-based applications have become more prevalent. With the continued increase in the amount of work in this field, there
have been recent attempts to review these publications in survey works. In the field of network security, we identified nine survey papers which attempt to discuss adversarial machine learning from the context of network security. None of these previous survey papers, however, has explored the vast amount of research work currently ongoing on the topic of adversarial machine learning in network security in a manner that categorizes it based on security applications, problem and feature space dimensional classification, and an adversarial risk grid map.

More importantly, our survey seeks to distinguish between adversarial attacks in general and adversarial machine learning in context. We note that an adversary may seek to compromise network security applications in various ways, and this may not be related to adversarial machine learning; see, for example, [5], where adversarial attacks on intrusion detection systems were reviewed. In our context, adversarial machine learning specifically addresses the optimization problem in which a machine learning based network security solution is being attacked. Many network security solutions are strictly rule based or hard-coded and do not implement machine learning techniques. Our survey does not refer to such adversarial attacks, since they do not capture the real context of adversarial machine learning in principle.

III. APPLICATIONS OF MACHINE LEARNING IN NETWORK SECURITY

Today's networks as well as next generation network architectures have become quite complex, and new innovations in network security solutions are required to protect against the growing landscape of cyber threats. Machine learning techniques have been increasingly used to carry out a wide range of tasks in network security [25], incorporating several layers of defenses both within the network and at the edge of the network. In this section, we review and highlight some applications of machine learning in network security by classifying them into five categories, as illustrated in Figure 2.

A. Machine Learning for Network Protection

Intrusion Detection Systems (IDS) are essential solutions for monitoring events dynamically in a computer network or system. Essentially, there are two types of IDS: signature based and anomaly based [26]. Signature based IDS detects attacks based on a repository of attack signatures with no false alarms [27]. However, zero-day attacks can easily bypass signature-based IDS. Anomaly IDS [27] uses machine learning and can detect new types of attacks and anomalies. A typical disadvantage of anomaly IDS is the tendency to generate a significant number of false positive alarms.
• Hybrid Approach for Alarm Verification: Sima et al. [28] designed and built a Hybrid Alarm Verification System that must process a significant number of real-time alarms, classify false alarms with high accuracy, and perform historical data analysis. The proposed system consists of three components: machine learning, stream processing, and batch processing (alarm history). A machine learning model is trained offline and used in a verification service that can immediately classify true or false alarms. They used different machine learning algorithms in their experiments to show the effectiveness of their system, whose accuracy exceeds 90% on a stream of 30K alarms per second [28].
• Learning Intrusion Detection: Laskov et al. [29] worked on developing a framework to compare supervised learning (classification) and unsupervised learning (clustering) techniques for detecting intrusions and malicious activities. The supervised methods used to evaluate the work include k-Nearest Neighbor (kNN), decision trees, Support Vector Machines (SVM) and Multi-Layer Perceptron (MLP). Also, k-means clustering and single linkage clustering were used as the unsupervised algorithms. The evaluation was run under two scenarios to evaluate how well the IDS could generalize its knowledge to new malicious activities. The supervised algorithms showed better classification of the known attacks. The best result among the supervised algorithms was achieved by the decision tree algorithm, with a 95% true positive and 1% false positive rate, followed by MLP, SVM and then kNN. If there were new attacks not previously seen in the training data, the accuracy decreased significantly. However, the unsupervised algorithms performed better on unseen attacks and did not show a significant difference in accuracy between seen and unseen attacks [29]; a toy comparison of the two approaches is sketched below.
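To make the supervised versus unsupervised contrast concrete, the following minimal sketch trains a decision tree and a majority-labeled k-means clustering on synthetic data standing in for intrusion features. The dataset, class balance and algorithm settings are illustrative assumptions and are not taken from [29].

```python
# Toy supervised vs. unsupervised comparison on synthetic "intrusion" data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=15, weights=[0.8],
                           random_state=0)   # y = 1 plays the role of "attack"
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("decision tree accuracy:", round(tree.score(X_te, y_te), 3))

# Unsupervised alternative: cluster the data, then label each cluster by
# the majority class of the training points that fall into it.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
cluster_label = {c: int(round(y_tr[km.labels_ == c].mean())) for c in range(2)}
pred = np.array([cluster_label[c] for c in km.predict(X_te)])
print("k-means accuracy:      ", round((pred == y_te).mean(), 3))
```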
B. Machine Learning for Endpoint Protection

Malware detection is a significant part of endpoint security, including workstations, servers, cloud instances, and mobile devices. Malware detection is used to detect and identify malicious activities caused by malware. With the increase in the variety of malware activities, the need for automatic detection and classification grows as well. Signature-based malware detection is commonly used for existing malware that has a known signature, but it is not suitable for unknown or zero-day malware. Machine learning can cope with this increase and discover underlying patterns in large-scale datasets [30].
• Automatic Analysis of Malware Behavior: Rieck et al. [31] proposed a framework for analyzing malware behavior automatically using various machine learning techniques. The framework allows clustering of similar malware behaviors into classes and assigns new malware to these discovered classes. They designed an incremental approach for behavior analysis that can process various malware behaviors, reduce the run-time of the defense against malware development compared to other analysis methods, and provide accurate discovery of novel malware. To implement this automatic framework, they collected a large number of malware samples, monitored their behaviors using a sandbox environment, and learned those behaviors using clustering and classification algorithms [31].
Fig. 2. Machine Learning Applications in Network Security
• Automated Multi-level Malware Detection System: In [32], the authors proposed an Advanced Virtual Machine Monitor-based guest-assisted Automated Multi-level Malware Detection System (AMMDS) that leverages both Virtual Machine Introspection (VMI) and Memory Forensic Analysis (MFA) techniques to mitigate in real time the symptoms of stealthily hidden processes on the guest OS [32]. They use different machine learning techniques such as Logistic Regression, Random Forest, Naive Bayes, Random Tree, Sequential Minimal Optimization (SMO), and J48 to evaluate the AMMDS, and the results achieve 100%.
• Classification of Malware System Call Sequences: Kolosnjaji et al. [30] focused on the utilization of neural networks, stacking layers according to deep learning practice, to improve the classification of newly retrieved malware samples into a predefined set of malware classes. They constructed Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) layers for modeling system call sequences. The sequences used by the CNN layers were based on a set of n-grams; the presence of the n-grams and their relations were counted in a behavioral trace. The RNN, on the other hand, used sequential information to train the model, while a dependence between the system call appearance and the system call sequence was maintained. If this model was trained properly, it usually provided better accuracy on subsequent data and most often captured more training set information. This deep learning technique for capturing the relation between the n-grams in the system call sequences was deemed relatively efficient, as it achieved 90% average accuracy, precision and recall for most of the malware families [30]. A simplified, non-deep featurization of such traces is sketched after this list.
• A Hybrid Malicious Code Detection Method: Li et al. [33] proposed a hybrid malicious code detection scheme based on AutoEncoders and Deep Belief Networks (DBN). They used the AutoEncoder to reduce the dimensionality of the data by extracting the main features. They then used a DBN composed of multilayer Restricted Boltzmann Machines (RBM) and a layer of BP neural network to detect malicious code. The BP neural network takes its input vector from the last RBM layer, which is trained with unsupervised learning, and then applies supervised learning in the BP neural network, yielding an optimal hybrid model. The experimental results, verified on the KDDCUP'99 dataset, show higher accuracy compared to a single DBN and reduced time complexity [33].
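As a simplified, non-deep illustration of the featurization step described in the system-call bullet above, the sketch below counts system-call n-grams in behavioral traces and feeds them to a linear classifier. The traces, call names and labels are invented for illustration; this is not the dataset or the CNN/RNN architecture used in [30].

```python
# n-gram featurization of system-call traces feeding a simple classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

traces = [
    "open read write close",             # hypothetical benign trace
    "open mmap exec socket connect",     # hypothetical malicious trace
    "read write read write close",
    "socket connect send exec fork",
]
labels = [0, 1, 0, 1]                    # 1 = malware (assumed)

# Count system-call 1- to 3-grams in each behavioral trace
vec = CountVectorizer(ngram_range=(1, 3), token_pattern=r"[a-z_]+")
X = vec.fit_transform(traces)

clf = LogisticRegression(max_iter=1000).fit(X, labels)
test = vec.transform(["open socket connect exec"])
print("predicted label:", clf.predict(test)[0])
```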
C. Machine Learning for Application Security

Various machine learning tasks are used for application security, including malicious web attack detection, phishing detection and spam detection.
• Detection of Phishing Attacks: Basnet et al. [34] studied and compared the effectiveness of using different machine learning algorithms for the classification of phishing emails, using many novel input features that help in detecting phishing attacks. The training dataset is labeled as phishing or legitimate email. They used unsupervised learning to extract features directly without prior training, which provides fast and reliable knowledge from the dataset. They used 4000 emails in total, of which 2000 were used for testing. They applied Support Vector Machines (SVM), Leave One Model Out, Biased SVM,
Neural Networks, Self Organizing Maps (SOMs) and K-Means to the dataset. Consistently, the Support Vector Machine achieved the best results; the Biased Support Vector Machine (BSVM) and NN reached an accuracy of 97.99% [34].
• Adaptively Detecting Malicious Queries in Web Attacks: Don et al. [35] proposed a new system called AMODS and a learning strategy called SVM HYBRID for detecting web attacks. AMODS is an adaptive system that aims to periodically update the detection model to detect the latest web attacks. SVM HYBRID is an adaptive learning strategy implemented primarily to reduce manual work. The detection model was trained using a dataset obtained from an academic institute's web server logs. The proposed detection model outperformed existing web attack detection methods with an FP rate of 0.09% and a 94.79% F-value. The SVM HYBRID system obtained 2.78 times the number of malicious queries obtained by the popular SVM method. Also, the Web Application Firewall (WAF) can use the malicious queries to update its signature library. The significant queries were used to update the detection model, which consisted of a meta-classifier as well as three other base classifiers [35].
• URLNet - Learning a URL Representation with Deep Learning for Malicious URL Detection: Le et al. [36] proposed an end-to-end deep learning framework which does not require sophisticated feature engineering. URLNet was introduced to address several limitations found in other model approaches. This framework learns directly from the URL how to perform a nonlinear URL embedding, which then enables it to successfully detect various malicious URLs. Convolutional Neural Networks (CNN) were applied to both the characters and the words of each URL to discover the URL embedding. They also proposed advanced word-embedding techniques to deal with uncommon words, a limitation experienced by other malicious URL detection systems. The framework can then learn from unknown words at testing phase [36].
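As a shallow stand-in for the character-level modeling just described, the sketch below builds character n-gram features directly from raw URLs and trains a linear classifier. The URLs, labels and hyperparameters are illustrative assumptions, not the URLNet framework of [36].

```python
# Character n-gram URL classifier sketch (simplified stand-in for URLNet).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

urls = [
    "http://example.com/index.html",
    "http://login-secure-update.example-bank.xyz/verify",
    "https://docs.example.org/guide",
    "http://198.51.100.7/free-gift/claim.exe",
]
labels = [0, 1, 0, 1]                    # 1 = malicious (assumed)

# Character 3- to 5-grams capture URL substructure without manual features
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vec.fit_transform(urls)

clf = LogisticRegression(max_iter=1000).fit(X, labels)
test = vec.transform(["http://secure-verify.example-bank.xyz/login"])
print("malicious probability:", round(clf.predict_proba(test)[0, 1], 3))
```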
D. Machine Learning for User Behavior Analytics

User behavior analytics is a cybersecurity process which involves analyzing patterns in human behaviors and detecting anomalies that give an indication of fraudulent activities or insider threats. Machine learning algorithms are used to detect such anomalies in user actions, such as unusual login attempts, and to infer useful knowledge from those patterns.
• Authentication with Keystroke Dynamics: Revett et al. [37] proposed a system using a Probabilistic Neural Network (PNN) for keystroke dynamics that captures the typing style of a user. A system comprising the login credential keystrokes of 50 users was evaluated. The authors [37] used eight attributes to monitor the enrollment and authentication attempts. An accuracy of 90% was obtained in classifying legitimate users from imposters. A comparison of the training time between the PNN system and a Multi-Layer Perceptron Neural Network (MLPNN) showed that the PNN was four times faster.
• Text-based CAPTCHA Strengths and Weaknesses: Bursztein et al. [38] showed in a study that several well known websites still implemented technologies that have been proven to be vulnerable to cyber attacks. In the study, an automated Decaptcha tool was tested on numerous websites, including well known names such as eBay, Google and Wikipedia. It was observed that 13 out of 15 widely used web technologies were vulnerable to their automated attack, with a significant success rate on most of the websites. Only Google and reCAPTCHA were able to resist the automated attack. Their study revealed the need for more robust CAPTCHA designs in most of the widely used schemes. The authors recommended that the schemes should not rely on segmentation alone, because it did not provide sufficient defense against automated attacks.
• Social Network Spam Detection: K. Lee et al. [39] proposed a social network spam detection system that gathers legitimate and spam profiles and feeds them to a Support Vector Machine (SVM) model. The authors selected two social networks, Twitter and MySpace, to evaluate the proposed machine learning system. They collected data over several months and fed it to the SVM classifier. The dataset contains 388 legitimate profiles and 627 spam profiles collected from MySpace, and 104 legitimate profiles and 168 profiles of promoters and spammers collected from Twitter. The system achieved a low false positive rate and high precision, up to 70% for MySpace and 82% for Twitter.

E. Machine Learning for Process Behavior Analytics

Machine learning applications usually necessitate learning and having some domain knowledge about business process behaviors in order to detect anomalous behaviors. Machine learning can be used for determining fraudulent transactions within banking systems. It has also been successfully used for identifying outliers, classifying types of fraud and clustering various business processes.
• Anomaly Detection in Industrial Control Systems: Kravchik et al. [40] performed a successful study on the Secure Water Treatment testbed (SWaT) using deep convolutional neural networks (CNN) to detect most attacks on an Industrial Control System (ICS) with a low false positive rate. The anomaly detection method was based on the statistical deviation measurement of the predicted value. They performed the study using 36 different attacks from SWaT. The authors in [40] showed that using 1D convolutional networks for anomaly detection in ICS outperformed recurrent networks.
• Detecting Credit Card Fraud: Traditionally, a Fraud Detection System (FDS) uses old transaction data to predict a new transaction. An FDS must face various potential challenges and difficulties to achieve high accuracy and performance [41]. The traditional detection method does not solve all the problems
and challenges, including imbalanced data, where only a small fraction of transactions are fraudulent. Misclassification, overlapping data, and fraud detection cost are other major challenges [41]. Chen et al. [42] proposed an approach to solving the listed challenges and problems for credit card fraud. They introduced a system to prevent fraud from the initial use of credit cards by collecting user data from an online questionnaire based on consumer behavior surveys. They used various classifier models: decision trees (C5.0, C&RT, CHAID) and SVMs (linear, radial basis, polynomial and sigmoid kernels). They used three datasets to develop a questionnaire-responded transaction (QRT) model to predict new transactions.
• Deep Learning Techniques for Side-Channel Analysis: Prouff et al. [43] defined Side-Channel Analysis as a type of attack that attempts to leak information from a system by exploiting some parameters of the physical environment [43]. The attack exploits the running time of some cryptographic computation, especially in block ciphers. The capability of a system to resist side-channel attacks (SCA) requires an evaluation strategy that focuses on deducing the relationship between the device behavior and the sensitivity of the information that is common in classical cryptography. The authors in [43] proposed an extensive study of the use of deep learning algorithms in Side-Channel Analysis. They also focused on hyper-parameter selection to help in designing new deep learning classifiers and models. They confirmed that Convolutional Neural Network (CNN) models are better at detecting SCA. Their proposed system outperformed the other tested models on highly desynchronized traces and also had the best performance on small desynchronized traces [43].

IV. ADVERSARIAL MACHINE LEARNING

Adversarial attacks have been studied for more than a decade now [12]. However, the first notable discovery in adversarial attacks for computer vision was by Szegedy et al. [44], who reported that a small perturbation in the form of a carefully crafted input could confuse a deep neural network into misclassifying an image object. Other researchers have demonstrated the use of adversarial attacks beyond image classification [45][46][47][48].

In adversarial machine learning, an adversary seeks to confuse a machine learning model into making a wrong decision. The adversary achieves this by modifying the input data that is fed to the machine learning model either during the training phase (poisoning attack) [49] or during the inference phase (evasion attack) [50].

The reason behind adversarial examples has been linked to the fact that most machine learning models remain overly attached to the superficial statistics of the input data [51] [52]. This attachment to the input data makes machine learning highly sensitive to distribution shift, resulting in a disparity between semantic changes and a decision change [1].

We consider the security model for the use of machine learning in network security as a combination of four components, namely the attack surface, threat model, adversarial framework and adversarial risk. An alternative adversarial model was proposed in [53], which modeled the adversary using a threefold approach based on knowledge, goals and capability. The attack surface identifies the various attack vectors along a typical machine learning data processing pipeline in network security related applications. The threat model provides a system abstraction for profiling the adversary's capabilities and the associated potential threats. The adversarial framework details our approach for classifying the various attacks and defenses within each network security domain, and lastly the adversarial risk provides an evaluation of the likelihood and severity of adversarial attacks within a network security system.

A major component of an adversarial attack is the adversarial sample. As illustrated in Figure 3, an adversarial sample consists of an input to a machine learning model which has been perturbed. For a particular dataset with features x and label y, a corresponding adversarial sample is a specific data point x' which causes a classifier c to predict a label on x' other than y, while x' is almost indistinguishable from x. Adversarial samples are created using one of many optimization methods known as adversarial attack methods. Crafting adversarial samples involves solving an optimization problem to determine the minimum perturbation which maximizes the loss for the neural network.

Fig. 3. Adversarial Machine Learning

Considering an input x and a classifier f, the optimization goal for the adversary is to compute a perturbation with a small norm, measured w.r.t. some distance metric, that would modify the output of the classifier such that

f(x + δ) ≠ f(x)
where δ is the perturbation. If δ is applied to all of the input data (all of an image's pixels, for example), it is considered a dense adversarial attack. However, if just partial positions are perturbed, it is called a sparse adversarial attack [54].
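Written out explicitly, the adversary's objective above is commonly stated as a minimal-perturbation optimization problem. The following is a standard formulation consistent with the notation used here; the choice of the p-norm and of the validity set X is a modeling assumption rather than something fixed by this survey:

min_δ ‖δ‖_p   subject to   f(x + δ) ≠ f(x),   x + δ ∈ X,

where X denotes the set of valid inputs, for example the allowed pixel range of an image or the functionality-preserving modifications of a malware binary.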
Adversarial machine learning in network security is typically an arms race between two agents. The first agent is an adversary whose objective is to intrude on a network with a malicious payload. The other agent is one whose role is to protect the network from the consequences of the malicious payload.

We start with a view of the different types of data that traverse a network at any given time.

A. Adversarial Attack Taxonomy

We examine the adversarial attack taxonomy in Table I to consider the goals and capabilities of any adversary against a machine learning system. We base our threat framework on the original model in [8] [53] and adapt it to the context of adversarial attacks in the network security domain. Within this context, adversarial attack threats in network security may be considered based on the attacker's knowledge, attack space, attacker's strategy, attacker's goal and attack target. As mentioned in Section I, to the best of our knowledge, this is the first review to add the idea of the space dimension to the classification of adversarial attacks in network security.

1) Knowledge: The knowledge component of the adversarial threat model describes the extent to which the adversary knows about the machine learning system as a whole. Attacks can be classified accordingly as white-box, gray-box or black-box attacks.
• In white-box attacks, it is assumed that the attacker has complete knowledge of the training data, the learning algorithm, the learned model, as well as the parameters which were used while training the model. A white-box attack represents an adversary who has the exact information that is held by the owner or creator of the machine learning system under attack. In the majority of real world adversarial attack settings, this is usually not feasible.
• A gray-box attack assumes a more realistic approach and considers that there could be varying degrees of information accessible to the adversary [57]. For example, an adversary may have partial information about the model queries, or limited access to the training data. For a gray-box attack, the adversary does not have the exact knowledge which the creator of the model possesses, but has sufficient information to attack the machine learning system and cause it to fail.
• A black-box attack assumes that the adversary is totally unaware of the machine learning system. In this type of attack, the adversary has no knowledge about either the learning algorithm or the learned model. It may be argued that a truly black-box attack is impossible, because it is assumed that the adversary must at least have some specific information, for example the location of the model, before it can attack the model. The severity of black-box attacks poses a greater threat in practice. The model for real-world systems may be more restrictive than a theoretical black-box model in which the adversary can observe the full output of the neural network on arbitrarily chosen inputs. In [69], an analysis of three threat models was proposed. These models, defined as the query-limited setting, the partial information setting, and the label-only setting, provide a more accurate characterization of real-world classifiers. A corresponding construction of black-box adversarial attacks was proposed, making it possible to fool classifiers under these more restrictive threat models where it might otherwise have been impractical or ineffective.

2) Space: In the field of adversarial machine learning, the input space can be defined as a dimensional representation of all the possible configurations of the objects under consideration. We categorize this as feature space and problem space.
• Feature space modeling of an adversarial sample is a method in which an optimization algorithm is used to find the ideal value out of a finite number of arbitrary changes made to the features. In a feature space adversarial attack, the attacker's objective is to appear benign without generating a new instance. More precisely, a feature space is defined as the n-dimensional space in which all variables in the input dataset are represented. Taking as an example an intrusion detection dataset with 70 variables, this represents a 70-dimensional feature space, and a feature space adversarial attack in this context will seek to alter the feature representation by making changes within that 70-dimensional space. A feature space attack modifies the features of the instance directly. Using the example of malware adversarial attacks, a feature space adversarial malware attack will only modify the feature vectors; no new malware is created.
• The problem space refers to the input space in which the objects (e.g., images, files) reside. A problem space adversarial malware attack will modify the actual instance from the source to produce a new instance of the malware. Typically, a problem space adversarial attack tends to generate new objects in domains such as malware detection where there is no clear inverse mapping to the feature space [58]. A typical difference between a problem space adversarial attack and a feature space adversarial attack is that a feature space attack does not generate a new sample but only creates a new feature vector, while a problem space adversarial attack modifies the actual instance itself to create an entirely new object. The toy sketch below contrasts the two.
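The following toy sketch illustrates the distinction. It performs a feature-space attack on a synthetic "API-call" feature vector by greedily adding features (flipping bits from 0 to 1 only), so that only the feature vector changes; a problem-space attack would instead have to transform the underlying binary itself and then re-extract features. The detector, the features and the add-only constraint are assumptions made purely for illustration.

```python
# Toy contrast between feature-space and problem-space attacks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 30))          # 30 binary "API-call" features
w_true = rng.normal(size=30)
y = (X @ w_true > 0).astype(int)                # 1 = "malicious" (synthetic)

clf = LogisticRegression(max_iter=1000).fit(X, y)
sample = X[y == 1][0].astype(float).copy()      # one detected "malware" vector

# Feature-space attack: greedily ADD features (0 -> 1 only, so features already
# present are untouched) to lower the malicious score. No new binary is
# produced; only the feature vector changes.
for _ in range(10):
    candidates = []
    for j in np.where(sample == 0)[0]:
        cand = sample.copy(); cand[j] = 1
        candidates.append((clf.predict_proba([cand])[0, 1], j))
    if not candidates:
        break
    best_score, best_j = min(candidates)
    sample[best_j] = 1
    if clf.predict([sample])[0] == 0:
        break

print("malicious prob. after feature-space edits:",
      round(clf.predict_proba([sample])[0, 1], 3))
# A problem-space attack would instead transform the actual binary (e.g., by
# inserting benign code gadgets) and only then re-extract its feature vector.
```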
3) Strategy: Attacker's strategy refers to the phases of operation in which the adversary launches the attack. Three
main strategies which an adversary may use in adversarial attacks are Evasion, Poisoning and Oracle.

TABLE I
ADVERSARIAL ATTACK TAXONOMY

Category   | Types            | References
Knowledge  | Black Box        | [7] [55]
           | White Box        | [56] [1]
           | Gray Box         | [57]
Space      | Feature Space    | [15]
           | Problem Space    | [58]
Strategy   | Evasion          | [57] [59]
           | Poisoning        | [60] [61] [49]
           | Oracle           | [62] [60]
Goal       | Availability     | [63] [64]
           | Integrity        | [63] [65]
           | Confidentiality  | [63] [66]
Target     | Physical Domain  | [67] [68]
           | ML Model         | [1]

• Evasion attacks, also known as exploratory attacks or attacks at decision time, occur during the testing or inference phase. The attacker aims to confuse the decision of the machine learning model after it has been learned, as shown in Figure 4. Evasion attacks typically involve the computation of an optimization problem. The objective of the optimization problem is to compute a tiny perturbation δ which would cause an increase in the loss function. The change in the loss function would then be significant enough to result in a wrong prediction by the machine learning model. Evasion attacks are classified as gradient-based attacks or gradient-free attacks.
Gradient-based attacks are further classified based on the frequency with which the adversarial samples are updated or optimized: these are iterative or one-shot attacks. Iterative attacks provide tighter control of the perturbation in order to generate more convincing adversarial samples [60]; this, however, results in higher computational costs. The alternative to iterative attacks are one-shot attacks, which adopt a single-step approach without iterations. One-shot or one-time attacks are attacks in which the adversarial samples are optimized just once, whereas iterative attacks involve updating the adversarial samples multiple times. By updating the adversarial samples multiple times, the samples are better optimized and perform better compared to one-shot attacks, but iterative attacks cost more computational time to generate. Adversarial attacks against certain machine learning techniques which are computationally intensive, such as reinforcement learning, usually demand one-shot attacks as the only feasible approach [59].
Gradient-free attacks [57], unlike gradient-based attacks, do not require knowledge of the model. Gradient-free attacks can generate potent attacks against a machine learning model with knowledge of only the confidence values of the model.

Fig. 4. Evasion Attack

• Poisoning attacks, also known as causative attacks, involve adversarial corruption of the training data or model logic during the training phase to induce a wrong prediction from the machine learning model, as shown in Figure 5. Poisoning attacks may be carried out by data injection, data manipulation or logic corruption [60]. Data injection occurs when the adversary inserts adversarial inputs to alter the data distribution while preserving the original input features and data labels. Data manipulation refers to a situation in which either the input features or the data labels of the original training data are modified by the adversary. Logic corruption is an attempt by the adversary to alter the model structure itself.

Fig. 5. Poisoning Attack

• Oracle attacks occur when an adversary leverages
access to the Application Programming Interface of a model to create a substitute model with malicious intent. The substitute model typically preserves a significant part of the functionality of the original model [62]. As a result, the substitute model can then be used for other types of attacks, such as evasion attacks [60]. Oracle attacks can be further subdivided into extraction, inversion and inference attacks. The objective of an extraction attack is to deduce model architectural details such as parameters and weights from an observation of the model's output predictions and class probabilities [70]. Inversion attacks occur when the adversary attempts to reconstruct the training data. An inference attack allows the adversary to identify specific data points within the distribution of the training dataset [71].
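A minimal sketch of one common membership-inference baseline, thresholding the model's prediction confidence, is shown below. It is an illustrative heuristic under assumed synthetic data and an assumed threshold, not the specific attack of [71].

```python
# Confidence-thresholding membership-inference sketch.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_in, y_in)

# Members (training points) tend to receive higher confidence than non-members
conf_in = model.predict_proba(X_in).max(axis=1)
conf_out = model.predict_proba(X_out).max(axis=1)
threshold = 0.9            # assumed; in practice tuned, e.g., on shadow models

guess_in = conf_in >= threshold       # predicted "member"
guess_out = conf_out >= threshold
balanced_acc = (guess_in.mean() + (1 - guess_out.mean())) / 2
print("membership-inference balanced accuracy:", round(balanced_acc, 3))
```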
4) Goal: Traditionally, in the field of computer vision, adversarial attacks are regarded in terms of targeted or reliability attacks [17]. In targeted attacks, the attacker has a specific goal with regard to the model decision. Most commonly, the attacker aims to induce a definite prediction from the machine learning model. On the other hand, a reliability attack occurs when the attacker only seeks to maximize the prediction error of the machine learning model without necessarily inducing a specific outcome. Yevgeny et al. [14] have noted that the distinction between reliability and targeted attacks becomes blurred in attacks on binary classification tasks such as malware binary classification. As such, these conventional paradigms of attacker goal classification are not optimal for network security. We choose to adopt the CIA triad in this context and find that it is more suitable for classifying the adversary goals in the network security domain.
• Confidentiality attacks refer to the goal of the attacker to intercept communication between two parties A and B, to gain access to private information being exchanged. This happens within the context of adversarial machine learning, whereby machine learning techniques are being used to carry out network security tasks.
• Integrity attacks seek to cause a misclassification, different from the actual output class which the machine learning model was trained to predict. An integrity attack could result in a targeted misclassification or a reliability attack. A targeted misclassification attempts to make the machine learning model produce a specific wrong prediction. A reliability attack results in either a confidence reduction or a misclassification to any arbitrary class apart from the correct class.
• Availability attacks result in a denial of service situation for the machine learning model. As a result, the machine learning model becomes either totally unavailable to the user, or its quality is significantly degraded to the extent that the machine learning system becomes unusable to the end users.

5) Target: In our surveyed work, adversarial attacks are targeted against a specific machine learning technique. Several successful attempts have been made towards the transferability of adversarial attacks [72] [73]. However, attacks that have been targeted towards a specific machine learning technique, for example unsupervised learning, have not been successfully transferred to another technique, for example supervised learning. Regarding the physical domain, the target includes input sensors, cameras and output actions.

B. Adversarial Attack Methods and Algorithms

We recall that adversarial attacks could be deployed either during decision time (evasion attacks) or during training time (poisoning attacks). In each case, the training algorithm (for poisoning attacks) or the learned model (for evasion attacks) is manipulated with some form of carefully crafted input known as adversarial samples. A common trend among the attack methods below reveals that the robustness of a machine learning model to a large extent depends on the ability of an attacker to find an adversarial sample that is as close as possible to the original input. In this section, we evaluate the primary methods for generating adversarial samples. It should be noted that recent research has shown the limitations of some earlier methods that are still listed here for reference, even though more effective methods have since been introduced.

In the previous Section IV-A, we described our threat model for adversarial attacks in network security. In this section, we introduce a classification method for the various adversarial attack algorithms. As seen in Figure 6, our classification method is based on the adversary strategy described in Section IV-A.3.

1) Evasion Attacks: Evasion attacks attempt to mislead the machine learning system during the testing or inference phase. Below we highlight adversarial attack methods that fall within this category of evasion attacks. The attacks are further divided into gradient-based and gradient-free attacks.
• Gradient-based attacks: Szegedy et al. [44] studied how adversarial samples could be generated against neural networks for image classification. The L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) method was then introduced, which used an expensive linear search method to find the optimal values of the adversarial samples. In a different approach proposed by Goodfellow et al. [1], called the Fast Gradient Sign Method (FGSM), adversarial samples are created by finding the maximal direction of positive change in the loss. This is a faster method than the L-BFGS method, since only a one-step gradient update is performed along the direction of the sign of the gradient at each step.
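A minimal sketch of the one-step FGSM update is shown below, assuming PyTorch is available and using an untrained toy model and a random feature vector as stand-ins for a real detector and input; it is meant only to show the sign-of-gradient step, not to reproduce the results of [1].

```python
# Minimal FGSM sketch (toy model and input are illustrative assumptions).
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 20)            # original input (e.g., a feature vector)
y = torch.tensor([1])             # its true label
epsilon = 0.1                     # maximum L-infinity perturbation

x_adv = x.clone().detach().requires_grad_(True)
loss = loss_fn(model(x_adv), y)
loss.backward()                   # gradient of the loss w.r.t. the input

# One-shot step in the direction that increases the loss
x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```

For a trained model, this single signed-gradient step is typically enough to change the prediction when epsilon is large enough; iterative variants repeat the step with a smaller step size.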
Fig. 6. Adversarial Attack Algorithms
The attack proposed in [83] overcomes and remedies the weaknesses of Projected Gradient Descent (PGD) [6] that lead to false conclusions about model robustness. The first PGD attack uses a fixed step size with cross-entropy as the loss function, which causes the failures identified by [90]. In [83], the authors use a new gradient-based scheme without step size selection, together with a different loss function. With these two changes, two versions of PGD are produced whose only free parameter is the number of iterations. They also combine the new PGD versions with the FAB attack [91] and the Square attack [87] to produce a parameter-free attack called AutoAttack, which was tested at large scale on 40 classifiers.
Sabour et al. [78] proposed a new adversarial image attack that focuses not only on the class label but also on the internal representations. The attack, known as Feature Adversaries, makes it possible to deceive a trained DNN by finding a small perturbation of a source image that creates an internal representation similar to that of a target image and no longer related to the source image. The authors, however, take into consideration that such adversaries are not outliers. Universal Perturbation [84] was proposed by Moosavi et al. as an algorithm to compute a single small image perturbation that causes a state-of-the-art deep neural network classifier to misclassify. The main focus of this algorithm is to find a perturbation vector that deceives the classifier on all the sampled data points. This fixed perturbation is built gradually, accumulating the changes that lead images to change label, until the universal perturbation is obtained.
• Gradient-free attacks: The Decision Tree Attack was proposed by Papernot et al. [72]. This type of black-box attack uses the transferability of adversarial samples between and within different classifiers, including deep neural networks, logistic regression, decision trees, support vector machines (SVM), ensembles, and nearest neighbors. They demonstrated that black-box attacks are feasible against machine learning algorithms that do not use deep neural networks, and that adversarial samples transfer well between and across models using the same and different machine learning techniques. Chen et al. [81] proposed an adversarial attack algorithm against DNNs based on elastic-net regularization of the L1 and L2 penalties, called elastic-net attacks to DNNs (EAD). EAD considers state-of-the-art L2 and L-infinity attacks, and the authors demonstrated that EAD could break undefended and defensively distilled DNNs. They also improve the transferability of attacks and adversarial training. The Shadow Attack was proposed by Ghiasi et al. [85]; it is a new method for attacking systems that rely on certificates, fooling certified robust networks into assigning the wrong label to an image while producing a spoofed secure robustness certificate for the adversarial example. Adversarial Patch, proposed by Brown et al. [86], presents universal, robust, and targeted adversarial patches for the real world that do not require any knowledge about what image they are attacking. These adversarial patches can be used to attack any classifier, they remain effective under many transformations, and existing defense methods may not be robust to such large transformations. The adversarial patch leads the classifier to switch class labels to any target class. Chen et al. [92] developed HopSkipJumpAttack, a decision-based attack, which is a type of black-box attack. This algorithm iteratively generates targeted and untargeted adversarial samples with minimal distance, and it demonstrates superior efficiency over various state-of-the-art decision-based attacks. Each iteration of the algorithm is based on gradient-direction estimation, step-size search, and boundary search.

2) Poisoning Attacks: A poisoning attack, also known as a causative attack, uses direct or indirect means to alter the data or the model. Poisoning attacks occur either by injecting false data, manipulating the original data, or corrupting the model logic.
• Data Injection: Biggio et al. [76] proposed a gradient-ascent-based attack on SVMs that manipulates the input data so as to maximize the error over the non-convex surface and increase the classifier's misclassification at test time. Gu et al. [77] proposed BadNets, which perform adversarial attacks by producing a backdoored neural network, or BadNet. The attack is based on a fully or partially outsourced training process, in which the attacker provides the user with a trained model containing a backdoor that causes targeted misclassification, and in some cases degrades accuracy, when a so-called backdoor trigger is present. For example, in autonomous driving, an attacker provides the user with a backdoored street sign detector which classifies stop signs well in most cases, except when a stop sign carries a particular sticker, in which case it is classified as a speed limit sign. This type of attack occurs under two scenarios: the user outsources the training of the model, or downloads a pre-trained model.
• Data Manipulation: The Feature Collision Attack proposed by Shafahi et al. [61] presents a watermarking poisoning attack, based on optimization, that crafts a clean-label attack targeting the behavior of a neural network classifier on a specific instance. The attack uses enhanced preservation techniques to make it difficult to detect.
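As a minimal illustration of data manipulation through label flipping (not a reproduction of the specific attacks above), the sketch below flips a fraction of training labels and compares the accuracy of a victim model trained on clean versus poisoned data. The data, model and poisoning rate are assumptions for illustration.

```python
# Minimal label-flipping poisoning sketch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Poison 30% of the training labels by flipping them
rng = np.random.default_rng(0)
idx = rng.choice(len(y_tr), size=int(0.3 * len(y_tr)), replace=False)
y_poisoned = y_tr.copy()
y_poisoned[idx] = 1 - y_poisoned[idx]
poisoned = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)

print("clean model accuracy:   ", round(clean.score(X_te, y_te), 3))
print("poisoned model accuracy:", round(poisoned.score(X_te, y_te), 3))
```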
3) Oracle Attacks: In an oracle-type adversarial attack, an adversary who has been given oracle prediction access to a model steals a copy of a remotely deployed machine learning model. This enables the adversary to duplicate the functionality of the model, i.e., to "steal the model" [62]. This attack has become increasingly common due to the rise of Machine Learning as a Service ("MLaaS") offerings, where several companies that offer cloud-based machine learning services, e.g., Google, Amazon, and BigML, provide easy-to-use web APIs to manage client interaction.
• Inversion Attacks: Fredrikson et al. [71] exposed the privacy issues of providing access to machine learning APIs. Their study demonstrated how an adversary
could utilize the confidence information of a model to carry out model inversion attacks. The attack, implemented as a function called the MI-Face attack, enables an adversary to extract pictures of subjects from a trained machine learning model.
• Inference Attacks: Fredrikson et al. [71] proposed the attribute inference attack, which could be launched either as a white-box or a black-box attack.
• Extraction Attacks: Correia-Silva et al. [74] demonstrated how an adversary could create a substitute model from a black-box convolutional neural network (CNN) model by querying the black-box model with random non-labeled data. A more intriguing aspect of this oracle-type extraction attack is the fact that the dataset used to persuade the model was not related to the original problem domain. Orekondy et al. [75] proposed Knockoff Nets, which are capable of stealing the functionality of a fully trained model using a two-step approach. The adversary first obtains predictions from the model by querying a set of input data, and the data-prediction pairs are then used to create a substitute model known as a "knock-off" model. Their approach uses reinforcement learning, with demonstrated query efficiency and performance gains compared to other oracle-type attacks. Jagielski et al. [70] proposed the Functionally Equivalent Extraction (FEE) attacks, which explore accuracy and fidelity objectives within the space of model extraction by improving the query efficiency of learning attacks. Their method is demonstrated to be practical for models with parameters in the range of millions. In their attack method, an adversarial model is produced whose architecture and weights are identical to the oracle's.
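The sketch below illustrates the general oracle/extraction pattern described above: query a deployed "victim" model as a black box and train a substitute on the returned labels. The victim model, the random query distribution and the query budget are illustrative assumptions, not the setups of [74], [75] or [70].

```python
# Minimal model-extraction sketch: black-box queries train a substitute.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)
victim = RandomForestClassifier(random_state=0).fit(X, y)   # deployed model

# Attacker only has query access: send random inputs, record predicted labels
rng = np.random.default_rng(0)
queries = rng.normal(size=(2000, 10))
answers = victim.predict(queries)

substitute = DecisionTreeClassifier(random_state=0).fit(queries, answers)

# Agreement between substitute and victim on fresh inputs
probe = rng.normal(size=(1000, 10))
agreement = (substitute.predict(probe) == victim.predict(probe)).mean()
print("substitute/victim agreement:", round(agreement, 3))
```

The substitute can then serve as a surrogate for mounting evasion attacks, as discussed for oracle attacks earlier in Section IV-A.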
Network security domain that utilize machine learning
V. A DVERSARIAL ATTACK C LASSIFICATION techniques fall into four broad categories namely malware
detection, phishing detection, spam detection and network
Multiple studies [93] [94] have sought to differentiate anomaly detection. We illustrate this categorization in Figure
the different domains of network security into multiple 7. The first three categories of network security tasks are
fragmented domains. A common approach for example make considered as endpoint based protection. Machine learning
attempts at differentiating malware and spam detection from applications within this endpoint based protection category
intrusion detection [9]. We find that this attempt of fine are typically initiated with payload features. Network protec-
grained classification results in redundancy, since the task of tion primarily constitutes network anomaly detection and ma-
malware or phishing detection in a network could be consid- chine learning applications within this category are typically
ered an intrusion detection task. As such, in this survey, we initiated with protocol features. Our study only considers
consider cyber attacks against a network as an attempt by an active attacks against a network, and passive attacks such as
adversary to intrude the network with a malicious payload. eavesdropping are not within the scope of this study. Ad-
We identify malicious payload in a network to consist of versarial attacks hence seek to generate adversarial samples
three broad types: malicious files (malware), malicious text using specific data objects.
(spam) and malicious url links (phishing). We note that In contrast to adversarial attacks in the field of image
attackers may use a combination of all three payloads in most processing or computer vision, network security’s adversarial
cyber attacks. For example, a spam email may also contain a learning is more challenging. This occurs because even very
link to a malicious url or contain a malicious file attachment. slight modifications to URLs, spam, packets, or malware
This payload approach becomes even more crucial in our bytes of the binary files can significantly alter the function-
study on adversarial attacks within the network security ality of the data. In computer vision, the addition of tiny
domain. We realise from our study that this distinction plays perturbations to an image sample does not alter the human
an important role in providing an accurate classification of perception of the image and same as in speech processing.
adversarial attacks within the network security domain, as Text processing and network security filtering techniques are
compared to other domains such as computer vision. similar in this regard since a very slight change in the input
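This functionality constraint can be stated compactly. In our notation (not drawn from any single cited work), a feature-space attack in computer vision searches for a norm-bounded perturbation, whereas an attack on a network security filter must additionally preserve the semantics of the data object:

\text{find } x' \ \text{s.t. } f(x') \neq f(x), \quad \|x' - x\|_p \le \epsilon \qquad \text{(computer vision)}

\text{find } x' \ \text{s.t. } f(x') \neq f(x), \quad \Phi(x') = \Phi(x) \qquad \text{(network security)}

where f is the target classifier and \Phi(\cdot) denotes the malicious functionality (or message semantics) of the object being perturbed.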
Fig. 7. Adversarial attack classification

A. Adversarial attacks against Malware Detection

A major component of endpoint protection in network security is malware detection. Yet, malware detection remains a challenging problem in network security. Between 2009 and 2019, the number of new malware digital signatures increased by over 2000 percent [95]. Therefore, traditional malware detection systems that rely solely on digital signatures have become less effective. Significant effort has been made in the use of machine learning to protect against malware attacks, and several studies have shown the vulnerability of these machine learning models to adversarial attacks. The most common approach is the addition of a selected sequence of bytes to the binary file. Several approaches have been considered for synthesizing this sequence of bytes, as discussed below.

Malware detection may be based on static analysis, in which the malware is detected without executing the code. Alternatively, dynamic analysis for malware detection typically executes a suspicious malware sample in a sandbox in an attempt to discover dynamic behavioural patterns such as API call sequences.

1) Iagodroid: One of the earliest attacks against machine learning based malware detection systems was the Iagodroid attack [96]. Iagodroid uses a method to induce mislabelling of malware families during the triaging process of malware samples. Its evasion rate reached 97 percent.

2) Stingray: Suciu et al. [97] proposed an adversarial attack against malware using the 'FAIL' model. Their study focuses on constraints of obscurity and transferability in order to realize a targeted poisoning attack. StingRay succeeded in half of the test cases.

3) Texture Perturbation Attacks: Researchers have deployed visualization techniques similar to computer vision and adapted them for malware classification [98]. This involves conversion of malware binary code into image data. The Adversarial Texture Malware Perturbation Attack (ATMPA) achieved 100 percent effectiveness in defeating visualization-based machine learning malware detection systems and also resulted in an 88.7 percent transferability rate [16]. The attack model for ATMPA works by allowing the attacker to distort the malware image data during the visualization process.

4) Android malware attack in problem space: The authors of [58] formalized an approach for problem space adversarial evasion attacks against machine learning based Android malware detection systems. Their study identified four main constraints which are characteristic of any problem space attack. It also adopted a technique which automates the generation of thousands of realistic and inconspicuous adversarial malware samples, further buttressing the notion of adversarial malware as a service as a real threat in network security. Their attack led to a misclassification rate of 100.0 percent on the successfully generated samples.

5) EvadeDroid: Bostani et al. [99] presented EvadeDroid, another problem space Android evasion attack. EvadeDroid is a query-efficient black-box attack that can fool ML-based Android malware detectors without altering the functionality of the original malware samples. It uses an n-gram-based similarity method to select candidate donors for gadget extraction, changing malware samples into benign-looking ones through an iterative and incremental manipulation technique. Their experimental results demonstrated that EvadeDroid's evasion rates are 81, 73, 75, and 79 percent for DREBIN, Sec-SVM, MaMaDroid, and ADE-MA, respectively.

6) EvnAttack: EvnAttack is an evasion attack model proposed in [47] which manipulates an optimal portion of the features of a malware executable file in a bi-directional way such that the malware is able to evade detection by a machine learning model, based on the observation that API calls contribute differently to the classification of malware and benign files. The detection model's false negative ratio almost reached 1 (100 percent), which means almost all malware samples were misclassified.

7) AdvAttack: AdvAttack was proposed in [45] as a novel attack method to evade detection with the adversarial cost as low as possible. This is achieved by manipulating the API calls, injecting more of those features which are most relevant to benign files and removing those features with higher relevance scores to malware. AdvAttack increased the classifier's false negative ratio to 71 percent while degrading the accuracy of the classifier to 58.5 percent.

8) MalGAN: To combat the limitations of traditional gradient-based adversarial sample generation, the use of a generative adversarial network (GAN) based algorithm for generating adversarial samples has been proposed. Generative models have been mostly used for input reconstruction by encoding an original image into a lower-dimensional latent representation [2]. The latent representation of the original input can be used to distort the initial input to create an adversarial sample. MalGAN, proposed by [100], leverages generative modeling techniques to evade black-box malware detection systems with a detection rate close to zero.
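For reference, MalGAN and the other GAN-based attacks discussed in this survey (e.g., GAPGAN, IDSGAN, Attack-GAN) build on the standard generative adversarial objective, shown here only in its generic form; the cited works adapt the discriminator to imitate or query the black-box detector, and we do not reproduce their exact formulations:

\min_{G}\max_{D}\; \mathbb{E}_{x\sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z\sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]

where the generator G learns to produce samples (here, adversarial feature vectors or payload bytes) that the discriminator D cannot distinguish from benign data.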
9) GAPGAN: Yuan et al. [101] introduced GAPGAN, an adversarial attack framework that generates adversarial examples against binaries-based malware detection through GANs. Adversarial perturbations are appended to the original malware binaries to maintain their malicious functionality. They tested GAPGAN on deep learning and MalConv detectors. GAPGAN reached a 100 percent attack success rate when appending payloads of 2.5 percent of the total length of the original data.

10) Black-Box Attacks against RNN Based Malware Detection Algorithms: Hu et al. [55] implemented a generative recurrent neural network (RNN) which generates sequential adversarial samples. In their study, the Gumbel-Softmax approach is used to approximate the generated discrete APIs. Before their attack, the victim RNN's malware detection rates ranged from 90.74 to 93.87 percent. After their adversarial attack, the detection rates on adversarial examples ranged from 0.44 to 3.03 percent.

11) Adversarial Deep Learning for Robust Detection of Binary Encoded Malware: Al-Dujaili et al. [102] proposed a method of generating adversarial malware samples with a focus on preserving the malicious functionality of the binary encoded files. They also introduced a mitigation framework known as SLEIPNIR, which employs a saddle-point optimization technique to learn malware detection models.

12) Deceiving End-to-End Deep Learning Malware Detectors using Adversarial Examples: Kreuk et al. [103] introduced a novel approach for creating adversarial malware samples by injecting a small sequence of bytes into the binary file. The approach was also found to be transferable across different malware files and families. In their study, they evaluated the effectiveness of adversarial malware samples based on five metrics, namely (1) file transferability, (2) spatial invariance, (3) payload size, (4) entropy, and (5) functionality preservation. Their study considered only white-box attacks and was not evaluated in black-box scenarios. Their injection procedure resulted in evasion rates of 99.21 and 98.83 percent.

13) Adversarial Examples on Discrete Sequences for Beating Whole-Binary Malware Detection: The authors of [104] focus on adversarial attacks against Convolutional Neural Network (CNN) based end-to-end malware detectors. End-to-end malware detectors such as MalConv [105] function quite differently from most deep learning based malware detectors in the sense that they take the whole malware binary file as an input. To achieve their aim, a surrogate loss function was proposed which enforces the modifications in the embedding space. Thus, the authors were able to modify the embedding vector in order to reconstruct the modified binary, which becomes the adversarial malware sample. To preserve the functionality of the malware binary, a unique section of payload bytes is perturbed and appended to the original malware binary file instead of perturbing the original binary file; perturbations are added in the embedding vector space and new binary files are reconstructed from the adversarial example. This attack's evasion rate reached 100 percent.

14) Adversarial-Example Attacks Toward Android Malware Detection Systems: MalGAN [100] is a black-box adversarial-example attack against Android malware detection, in which adversarial examples are generated using a generative adversarial network (GAN) without requiring knowledge about the target. Unfortunately, the effectiveness of MalGAN is affected if a firewall is incorporated into the malware detection system. Adversarial attacks were also studied against cloud-based Android malware detection systems. Li et al. proposed a bi-objective GAN-type adversarial attack against Android malware detection systems. Their technique has the novelty of implementing a GAN with two discriminators, in which one discriminator contends against the firewall while the other contends against the malware detector. This was the first study to target a firewall-equipped Android malware detection system.

15) Adversarial malware sample generation method based on the prototype of deep learning detector: Qiao et al. [106] presented a method for generating adversarial malware to fool deep learning-based malware detection systems. The post-hoc interpretability of deep learning is used by the authors to direct the updates to the malware file. Based on their experiments, the time needed to generate their adversarial malware is less than for other attacks. The fooling rate of this attack reached 92 percent.

16) Slack Attacks: A byte-based convolutional neural network (MalConv) was introduced by Raff et al. [107]. Unlike image perturbation attacks [44], where the fidelity of the image is of little concern, attacks that alter the binaries of malware files must maintain the semantic fidelity of the original file, because altering the bytes of the malware arbitrarily could affect the malicious effect of the malware. This problem can be addressed by appending adversarial noise to the end of the binary [48], which prevents the added noise from affecting the malware functionality. The Random Append attack and the Gradient Append attack are two types of append attacks, which work by appending byte values drawn from a uniform distribution and by gradually modifying the appended byte values using the input gradient value, respectively. Two additional variations of append attacks, the Benign Append and the FGM Append, were introduced by Suciu et al. [108], improving on the long convergence time experienced in previous attacks. When malware binaries exceed the model's maximum input size, it is impossible to append additional bytes to them. Hence, a slack attack proposed by Suciu et al. [108] exploits the existing bytes of the malware binaries. The most common form of the slack attack is the Slack FGM attack, which defines a set of slack bytes that can be freely modified without breaking the malware functionality.
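As a rough sketch (our notation, not the exact formulation of [48] or [108]), the append-style attacks leave the original bytes x of the binary untouched and optimize only a block b of k appended bytes:

x_{\text{adv}} = x \,\|\, b, \qquad b \leftarrow \Pi_{[0,255]^{k}}\Big(b - \epsilon \cdot \operatorname{sign}\big(\nabla_{b}\, s_{\text{mal}}(x \,\|\, b)\big)\Big)

where \| denotes concatenation, s_{\text{mal}} is the detector's malware score and \Pi projects back onto valid byte values. The Random Append variant simply draws b uniformly at random, while the Slack FGM attack applies the same style of update to slack bytes inside the file rather than to appended ones.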
17) Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Detection Models: Stokes et al. [109] proposed adversarial attacks against dynamic analysis-based malware detection systems. Their work focuses on different strategies for crafting adversarial samples for deep learning based dynamic analysis of malware samples. Their study is motivated by the fact that static analysis based deep learning malware classifiers only classify the content of the unknown file without execution, and become less effective when faced with packed or encrypted malware files. In addition, they propose a defense mechanism known as the weight defense mechanism. They compare their defense technique to existing defenses such as distillation and ensemble defenses. They did not, however, compare their study to the more popular approach of adversarial training, which is a proven method for reducing the vulnerability of deep learning classifiers to adversarial samples. Their study also indicates that adding more hidden layers to the neural network significantly improves the robustness of the deep learning based malware classifier to adversarial samples.

B. Adversarial attacks on Spam Detection

Spam detection is a significant endpoint protection component, used to protect users from unsolicited digital communications. Machine learning techniques are widely used in current spam filtering applications, most of which utilize supervised learning methods [110]. Multiple adversarial attacks on machine learning-based spam detection systems are discussed below.

1) Adversarial classification: Dalvi et al. [111] were the first to introduce a formal framework with corresponding algorithms to describe the problem of adversarial attacks against machine learning based spam detectors. In their study, they seek the minimum cost camouflage (MCC) of a data sample x, i.e., an adversarial sample MCC(x) generated with the minimum cost for which the classifier outputs a negative prediction. Similar studies [112] had considered adversarial attacks against spam detectors, albeit not machine learning based ones.

2) Attacks on Statistical Spam filters: Several spam filters such as SpamAssassin, SpamBayes and Bogofilter are based on the popular naive Bayes machine learning algorithm, which was first applied to filtering junk email in 1998 [113]. A variety of good word attacks introduced by Lowd [112] successfully prevented the machine learning models from detecting spam or junk emails. Using these attacks, an attacker can get 50 percent of currently blocked spam past a typical spam filter.

3) Exploiting Machine Learning to Subvert Your Spam Filter: Nelson et al. [114] showed in 2008 that an attacker could effectively disable the SpamBayes spam filter with little information and little control over the training data. Their Usenet dictionary poisoning attack caused misclassification of 36 percent of ham messages with only 1 percent control over the training data. They also presented a new class of focused attacks that stop victims from receiving specific email messages. With knowledge of only 30 percent of the target's tokens, their focused attack altered the classification of the target email 60 percent of the time.

4) Attacks against crowd-turfing detection systems: Machine learning techniques are used to identify misbehavior, including fake users in social networks, and to detect users who pay sites for fake accounts. Malicious crowdsourcing, or crowd-turfing, systems are used to connect users who are willing to pay with workers who carry out malicious activities such as the generation and distribution of fake news or malicious political campaigns. Machine learning models have been used to detect crowdturfing activity with up to 95 percent accuracy, particularly in detecting the accounts of crowdturfing workers [115]. However, malicious crowdsourcing detection systems are highly vulnerable to adversarial evasion and poisoning attacks.

5) Attacks Against ML for Keystroke Dynamics: Negi et al. [116] created adversarial keystroke samples that misled an otherwise accurate classifier into accepting the artificially generated keystroke samples as belonging to an authentic user. Almost 50 percent of the tested users were compromised by their attack.

6) Attacks against ML for credit card fraud detection: Zeager et al. [117] examined how a logistic regression classifier used as a fraud detection mechanism could be adversarially attacked to cause a number of fraudulent transactions to go undetected. Previous studies had used similar models based on game theory to investigate adversarial attacks against credit card fraud detection and email spam detectors. However, the authors introduced a new framework which produced an improved AUC score on multiple iterations of the validation sets compared to the performance of the models which credit card companies had previously used.

7) Crafting Adversarial Email Content against Machine Learning Based Spam Email Detection: Wang et al. [118] proposed two methods to create adversarial email content that bypasses spam detectors. The first approach approximates the Term Frequency-Inverse Document Frequency (TF-IDF) values in the resultant adversarial examples, and the second method recognizes and adds a group of significant words to fool the detectors. They tested their work on multiple machine learning models, including KNN, SVM, decision tree, and logistic regression, in both white-box and black-box attack scenarios. Their attacks' success rates ranged from 2.2 to 98.9 percent, which is inconclusive. However, they concluded that the second method is more effective.

8) Marginal Attacks of Generating Adversarial Examples for Spam Filtering: Zhaiquan et al. [119] created the marginal attack, which generates adversarial samples that can deceive naive Bayesian spam filters by selecting sensitive words from a sentence and then adding them at the end of the sentence. Their experiments showed that adding just one word to the message could reduce the model's accuracy from 93.6 to 55.8 percent. They also tested the transferability of the generated adversarial samples against standard machine learning filters such as logistic regression, decision tree, and linear support vector machines. In some cases, the accuracy of these filters could drop from 100 to 1.5 percent.
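Both the good word attacks of [112] and the marginal attack above exploit the same property of naive Bayes scoring, which in simplified form (our notation) is

\frac{P(\text{spam}\mid w_1,\dots,w_n)}{P(\text{ham}\mid w_1,\dots,w_n)} \;=\; \frac{P(\text{spam})}{P(\text{ham})}\prod_{i=1}^{n}\frac{P(w_i\mid \text{spam})}{P(w_i\mid \text{ham})}

so that appending words for which P(w | ham) is much larger than P(w | spam) multiplies the spam odds by small factors and can push an otherwise blocked message below the filter's decision threshold without modifying the original content.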
9) Universal Adversarial Perturbations and Image Spam Classifiers: Phung et al. [120] evaluated numerous adversarial attack methods against deep learning-based image spam classifiers and found that the universal perturbation method is the most harmful. They therefore used this approach to create a novel transformation-based adversarial attack that is capable of creating tailored "natural perturbations" in image spam. In some cases, their suggested attack can lower the model's accuracy to 23.7 percent.

C. Adversarial attacks against Phishing Detection

Phishing detection is a critical endpoint protection element aimed at saving users from serious fraudulent actions such as money stealing and access to private information. There are multiple techniques for phishing detection [121]: the list-based approach, the visual similarity-based approach, and the heuristics and machine learning-based approach, which is now the most popular method. Several adversarial attacks on machine learning-based phishing detection systems are discussed below.

1) FIGA: Gressel et al. [122] proposed the Feature Importance Guided Attack (FIGA), which fools phishing detection models by perturbing the most effective features of the input in the direction of the target class. It is a model-agnostic gray-box attack that needs knowledge of the feature representation of the victim model. FIGA was tested on eight different phishing detection models, and it reduced the F1-score of the models from 0.96 to 0.41 on average.

2) Bypassing Detection of URL-based Phishing Attacks Using Generative Adversarial Deep Neural Networks: AlEroud et al. [123] presented an evasion technique that attacks URL phishing detection systems via Generative Adversarial Networks (GANs). Their generated samples can deceive black-box phishing detectors even when those detectors are created using refined methods such as those relying on intra-URL similarities. Their experiments revealed that some classifiers were unable to identify any of the adversarial examples, leading to zero true positive rates. At the same time, the false positive rates, which indicate the percentage of benign examples classified as phishing, increased.

3) Generating Optimal Attack Paths in Generative Adversarial Phishing: Al-Qurashi et al. [124] proposed a method that creates adversarial phishing attacks by discovering optimal subsets of features that lead to a higher evasion rate. To achieve this, multiple feature engineering techniques are used, such as Recursive Feature Elimination, Lasso, and CancelOut. Their experiments revealed that their attack has better evasion capability than a Generative Adversarial Deep Neural Network (GAN) which randomly perturbs features.

4) Advanced evasion attacks and mitigations on practical ML-based phishing website classifiers: Song et al. [125] introduced multiple mutation-based techniques, differing in the knowledge of the target classifier (white, gray, and black box). They also proposed a sample-based collision attack to acquire knowledge of the target model in the white- and gray-box scenarios. Their evasion attacks fooled the classifiers without changing the functionality and appearance of the samples. Their attack's success rate varied depending on the knowledge and the attacked model. Attacks on Google's phishing page filter achieved a 100 percent attack success rate. Their transferability attack on BitDefender's industrial phishing page classifier, TrafficLight, achieved 81.25 and 50 percent transferability attack rates in the black- and gray-box scenarios.

D. Adversarial attacks against Network Anomaly Detection

Network anomaly detection devices learn network activity patterns and detect irregularities. They must continuously scan the network, analyze encrypted data, and spot anomalies in real time. Machine learning ticks all of these boxes, which is why it is used extensively in modern network anomaly detection tools; however, researchers have found ways to attack them. Several of these adversarial attacks are discussed below.

1) IDSGAN: IDSGAN was proposed by Lin et al. [126] for generating adversarial attacks targeted towards intrusion detection systems. IDSGAN is based on the Wasserstein GAN [127], which uses a generator, a discriminator and a black box. The discriminator is used to imitate the black-box intrusion detection system and at the same time provide the malicious traffic samples. IDSGAN can lower the detection rates of some IDS models to approximately zero percent.

2) TCP Obfuscation Techniques: Another method for evading machine learning based intrusion detection systems is the use of obfuscation techniques. Homoliak et al. [128] proposed the modification of various properties of network connections to obfuscate a TCP communication, which successfully evades a wide variety of intrusion detection classifiers.

3) Deep Adversarial Learning in Intrusion Detection: A Data Augmentation Enhanced Framework: Zhang et al. [129] proposed a framework which incorporates deep adversarial learning with statistical learning in a manner which exploits learning-based data augmentation. In the study, the Poisson-Gamma joint probabilistic generative model is used to synthesize adversarial samples.

4) Generative Adversarial Networks For Launching and Thwarting Adversarial Attacks on Network Intrusion Detection Systems: A generative adversarial network (GAN) based adversarial attack was proposed by Usama et al. [65]. Their method was the first attempt to utilize GAN-based adversarial attacks against a black-box intrusion detection system (IDS) while still preserving the functional behavior of the network traffic. In some cases, their attack dropped the accuracy of the detection model from 84.3 to 43.4 percent.

5) Adversarial deep learning for robust detection of binary encoded malware: Al-Dujaili et al. [102] developed four adversarial attack methods (rFGSM, dFGSM, BCA, and BGA) to generate adversarial examples of binary malware files that preserve their functionality. They developed a framework for training robust malware detection models by utilizing the saddle-point formulation, which consists of an inner maximization and an outer minimization problem. The inner maximization is used to generate powerful adversarial examples that maximize the loss, and these are then injected at training time. In some conditions, their attack's evasion rate exceeded 99 percent.
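The saddle-point formulation referred to above can be written as follows (our rendering; [102] instantiates it on binary feature vectors with specific inner maximizers such as rFGSM and BGA):

\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{x'\in\mathcal{S}(x)} \ell\big(f_{\theta}(x'), y\big)\Big]

where \mathcal{S}(x) is the set of functionality-preserving perturbations of x: the inner maximization crafts the strongest adversarial version of each sample, and the outer minimization fits detector parameters \theta that remain robust to it.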
6) Investigating Adversarial Attacks against Network Intrusion Detection Systems in SDNs: With the increasing deployment of ML-based NIDSs which leverage the global network visibility offered by SDNs, the threat of vulnerability of the ML algorithms to adversarial attacks is also considered. Aiken et al. [130] considered a use-case example of a SYN Flood DDoS attack, in which they demonstrated the ability to reduce the NIDS detection accuracy from 100% to 0% on multiple classifiers using evasion attacks; this was one of the most successful attempts at adversarial attacks against network intrusion detection systems. Their experimental platform was based on an ML-based NIDS for software-defined networks called Neptune. In their study, they demonstrated that, with the perturbation of a few features, the detection accuracy of a specific SYN flood Distributed Denial of Service (DDoS) attack by Neptune decreases from 100% to 0% across a number of classifiers. Furthermore, they proposed an adversarial test suite named Hydra to evaluate the impact of adversarial evasion attacks against the anomaly-based NIDS Neptune. Their study considered several classifiers and machine learning algorithms, showing that clustering algorithms were more robust to adversarial samples compared to other ML types. Specifically, KNN proved to be the most robust classifier against the adversarial attacks performed within their research, with only one combination of feature perturbations halving the detection accuracy from 100% to 50%. In contrast, Random Forest, LR, and Support Vector Machines were generally vulnerable to the same perturbations, resulting in similar detection accuracy reductions. The concept of attack generalization was also studied in this publication, using their Neptune NIDS framework, which was capable of implementing multiple classifiers, as the adversarial target.

7) IoT Network Security from the Perspective of Adversarial Deep Learning: The effect of adversarial attacks on wireless sensor networks was studied by Sagduyu et al. [131]. The study experimented with adversarial attacks in the context of three types of over-the-air (OTA) wireless attacks, namely jamming, spectrum poisoning, and priority violation attacks. Their study demonstrated how adversarial attacks can lead to a significant loss in throughput by fooling an IoT transmitter into making a wrong transmit decision in the test phase; this is also an evasion attack against the machine learning model. In their study, they considered an IoT network where an IoT transmitter predicts whether a channel is idle or busy using deep learning algorithms, and they showed that deep learning was effective in performing this task. Adversarial machine learning was then applied in the three contexts of jamming, spectrum poisoning and priority violation attacks. A defense system based on a Stackelberg game was shown to be an effective mitigation against adversarial machine learning in IoT networks. This defense technique is, however, considered not transferable, as it was not proven to be generalizable across multiple adversarial attack scenarios.

Several uses of deep learning for anomaly detection in wireless communication systems have been commonly implemented, including channel decoding [132], wireless resource allocation [133] [134] and radio signal (modulation) classification [135]. Uses of machine learning in IoT include anomaly detection [136], device identification [137] [138], and signal authentication [139].

8) Adversarial Attacks on Deep-Learning Based Radio Signal Classification: The robustness of deep learning based algorithms for the wireless physical layer was also studied within the context of radio signal (modulation) classification tasks. Sadeghi [140] investigated the use of convolutional neural networks and developed both white-box and black-box adversarial attacks for DL-based modulation classification. In their study, a VT-CNN was used as the classifier. The outcome of their research showed that significantly less transmit power is required by the attacker in order to cause misclassification in the case of adversarial machine learning, as compared to the case of conventional jamming (where the attacker transmits only random noise). Hence, adversarial machine learning is an alternative to signal jamming with random noise that requires fewer resources in terms of transmit power. Their research also created a computationally efficient algorithm for crafting universal adversarial perturbations (UAP), which can cause a misclassification of the deep learning model irrespective of the input provided to the model. Furthermore, their study revealed an interesting property known as the shift-invariant property of their attack method, which makes the attack generalizable across various deep learning models without having any knowledge of the nature of the model, thus implying a black-box attack. Their tests showed that after applying these attacks, the targeted model accuracy could drop from 75 to 0 percent in the case of a high perturbation-to-noise ratio (the ratio of the perturbation power to the noise power).
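For clarity, the perturbation-to-noise ratio (PNR) in this setting relates the power of the adversarial perturbation \delta to that of the channel noise n, typically expressed in decibels; as a sketch in our notation:

\text{PNR} = \frac{\mathbb{E}\,\|\delta\|^{2}}{\mathbb{E}\,\|n\|^{2}}, \qquad \text{PNR}_{\text{dB}} = 10\log_{10}(\text{PNR})

so the reported drop from 75 to 0 percent accuracy corresponds to perturbations whose power is large relative to the noise floor.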
9) Addressing Adversarial Attacks Against Security Systems Based on Machine Learning: Apruzzese et al. [141] proposed an attack and defense method against several types of machine learning algorithms for network intrusion detection systems. In their study, they evaluated both poisoning and evasion adversarial attacks against three supervised machine learning algorithms. The three algorithms, namely Random Forest, K-nearest neighbour and an Artificial Neural Network (multi-layer perceptron, MLP), were used to develop a network intrusion detection system. Their poisoning and evasion attack severity averaged 70.1 and 66.4 percent, respectively. They also demonstrated that adversarial training was effective in improving the robustness of deep learning based network intrusion detection systems.

10) Adversarial Deep Learning for Cognitive Radio Security: Jamming Attack and Defense Strategies: Shi et al. [142] proposed an adversarial machine learning approach for launching jamming attacks on wireless communications and introduced a defense strategy. The study builds on the premise that, in a cognitive radio network, a typical transmitter workflow includes the tasks of sensing available channels, identifying spectrum opportunities, and then transmitting data to the receiver in idle channels. As machine learning techniques have been progressively applied in this context, for example by implementing a deep learning classifier for classifying channels as either idle or busy, attackers seek to compromise the machine learning classifier. The attacker has no knowledge of the deep learning classifier, i.e., this is a black-box attack. Their experiments showed that their adversarial deep learning attack reduced the transmission success rate from 73.79 to 2.91 percent. The authors also propose a defense technique for the deep learning classifier that works by allowing the transmitter to deliberately take wrong actions in predetermined time slots in order to mislead the adversary.

11) Performance Evaluation of Physical Attacks against E2E Autoencoder over Rayleigh Fading Channel: Albaseer et al. [67] investigated the vulnerabilities of end-to-end (E2E) autoencoders under a Rayleigh channel model. Their study demonstrated the vulnerability of autoencoder deep learning models to adversarial samples when used in end-to-end wireless communication systems. Both white-box and black-box attacks were launched against an E2E model based on a realistic channel model. Their results showed that adversarial attacks had more significant impact compared to jamming attacks.

12) Physical adversarial attacks against end-to-end autoencoder communication systems: Sadeghi et al. [68] also showed that end-to-end learning of wireless communication systems is vulnerable to physical adversarial attacks. Similar to the work of Albaseer et al. [67], their study demonstrates that adversarial attacks are more destructive than jamming attacks.

13) Targeted Adversarial Examples Against RF Deep Classifiers: Kokalj-Filipovic et al. [143] studied the effect of adversarial samples on machine learning based classifiers for radio frequency signals. The goal of their research was to verify whether adversarial samples against machine learning based classification of radio frequency signals were as effective in the physical world (i.e., when launched over the air (OTA)) as in theoretical settings.

14) Deep Learning-Based Intrusion Detection With Adversaries: Wang et al. [15] evaluated the vulnerability of deep learning-based IDSs to state-of-the-art adversarial attack algorithms, including FGSM, JSMA, DeepFool, and CW, using the NSL-KDD dataset. They recognize feature patterns for the attack algorithms, and they demonstrated that modifying a limited number of features is better for most of the adversaries, as with JSMA attacks, which distinguish adversaries in terms of applicability. They also noted that the features selected for perturbation by an adversary vary depending on their degree of significance.

15) Evaluating Deep Learning-Based Network Intrusion Detection System in Adversarial Environment: Peng et al. [144] evaluated the robustness of their scalable ENIDS framework in an adversarial environment against various attacks (MI-FGSM, L-BFGS, PGD, and SPSA) using the NSL-KDD dataset. They compare different well-known models, including SVM, RF, and LR, with the proposed framework under adversarial attacks. They use different metrics to compare model robustness, including accuracy (ACC), Precision Rate (PR), Recall Rate (RR), F-score (FS), and Success Rate (SR).

16) Analyzing adversarial attacks against deep learning for intrusion detection in IoT networks: Ibitoye et al. [145] studied the effectiveness of adversarial samples against a deep learning-based Intrusion Detection System (IDS) within the context of an IoT network. The authors provide a comprehensive comparison between two different deep learning models, a Self-normalizing Neural Network (SNN) and a Feed-forward Neural Network (FNN). They utilize and study input feature normalization in a deep learning-based IDS in an adversarial environment, finding that it increases the robustness of the deep learning model against various adversarial attacks (FGSM, BIM, and PGD).

17) Online anomaly detection under adversarial impact: Kloft et al. [146] studied the effect of poisoning the training data of online centroid anomaly detection (IDS) with a finite sliding window. They study the poisoning attack with limited and full control of the training dataset using real HTTP traffic from a web server of the Fraunhofer FIRST institute. This study shows that if the attacker has full control of the data, the attack is easy; when additional constraints limit the attacker's control, by assuming that the attacker can inject only a small fraction of the training dataset, the attack fails. Adding such constraints therefore provides protection against poisoning attacks. Their results show that the method cannot be considered secure if the attacker has full control of the dataset.

18) Security evaluation of pattern classifiers under attack: Biggio et al. [147] proposed a framework for empirical security evaluation that can be applied in three different real-life applications: intrusion detection, spam filtering, and biometric authentication. They proposed an algorithm to sample training and testing sets, and they evaluate their framework's performance under causative adversarial attacks using the SVM and LR algorithms. For IDS, they used a public dataset from a web server with 205 malicious samples collected over five days in 2006. The authors recommend that classifier designers use their framework to evaluate the security of a classifier.

19) Evading Machine Learning Botnet Detection Models via Deep Reinforcement Learning: Wu et al. [148] introduced a generic black-box attack against botnet detection machine learning models. The authors use deep reinforcement learning (DRL) to generate adversarial traffic flows to deceive the detection models. A reinforcement learning agent updates the adversarial samples to change the temporal and spatial features of the traffic flows without altering the original functionality and executability. Their attack's evasion rate ranged from 69.3 to 80.4 percent.

20) Attack-GAN: Cheng et al. [149] proposed Attack-GAN to generate malicious adversarial raw packets that can mislead current machine learning network intrusion detection systems in the Internet of Things. Each byte in a packet is represented with a word embedding. Feedback from the victim NIDS is needed by this black-box attack to update the parameters of the generator. The attack success rate depends on multiple factors, such as the machine learning model and the mode of byte embedding, but it reached 98.42 percent in the best case.
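To make the byte-embedding step concrete, the sketch below (PyTorch; our illustration rather than the architecture of [149]) maps each raw packet byte to a learned vector before it is fed to a generator or detector:

    import torch
    import torch.nn as nn

    # Each of the 256 possible byte values gets a learned 32-dimensional vector.
    byte_embedding = nn.Embedding(num_embeddings=256, embedding_dim=32)

    # A (truncated) raw packet represented as a sequence of byte values in [0, 255].
    packet = torch.tensor([[0x45, 0x00, 0x00, 0x3C, 0x1C, 0x46, 0x40, 0x00]])
    vectors = byte_embedding(packet)
    print(vectors.shape)  # torch.Size([1, 8, 32])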
21) Fooling intrusion detection systems using adversarially autoencoder: Chen et al. [150] introduced the AIDAE (Anti-Intrusion Detection AutoEncoder) framework against IDSs. AIDAE can produce features matching the normal feature distribution while also keeping the correlation between the generated continuous and discrete features. They used the Evasion Increase Rate (EIR) to evaluate their attack. The EIR reflects the evasion power by comparing the adversarial detection rate with the original one, i.e., EIR = 1 - (adversarial detection rate / original detection rate). The EIR was higher than 0.9 in all their experiments.

22) TANTRA: Sharon et al. [151] presented TANTRA (Timing-Based Adversarial Network Traffic Reshaping), which deceives NIDSs by reshaping attack network traffic using the timestamp attribute. Based on the authors' evaluation, TANTRA had an extremely high success rate (99.99 percent). However, when TANTRA was tested after training the NIDSs with both benign and reshaped traffic, its success rate decreased.

VI. EVALUATING ADVERSARIAL RISK

In discussing adversarial risk, we introduce the concept of discriminative and directive autonomy of machine learning models. The two-fold goal of an adversarial risk grid mapping is to evaluate the likelihood of success of an adversarial attack against a machine learning model, and the consequence of that attack if it is successful. Adversarial risk often seeks to measure the performance of a machine learning model on worst-case inputs [152]. We present in this paper an adversarial risk grid map, shown in Figure 8, based on the level of autonomy of the machine learning model with respect to the learning technique and task. The concept of discriminative autonomy and directive autonomy of machine learning models represents a novel approach for evaluating the relative adversarial risk of a machine learning model.

A. Security by Obscurity in Adversarial Risk

The notion of security by obscurity in the adversarial context, in which defenses are proposed based on obscurity to an adversary, does not truly reflect the nature of adversarial risk in machine learning-based network security applications. The prevalence of black-box adversarial attacks, which fool classifiers without having direct access to the model, further demonstrates the weakness of the obscurity approach to adversarial risk.

As adversarial attacks continue to emerge in real-world production systems, the ability to computationally evaluate and even optimize adversarial risk becomes invaluable. While both adversarial risk and obscurity have been impossible to compute directly [152], frameworks for adversarial risk based on the concept of obscurity have been proposed [153].

B. Adversarial Risk Grid Map

A modified notion of adversarial risk was proposed in [154], which suggested that certain classifiers inherently have low adversarial risk. Other works [155] [156] have suggested a trade-off between standard risk and adversarial risk. This indicates that as the standard accuracy of a classifier increases, its adversarial risk increases.

Based on our review, a grid map based on the autonomy of the machine learning model is proposed. We term this the model autonomy adversarial risk approach, since it is based on the directive and discriminative autonomy of the machine learning models. The map is shown in Figure 8.

Fig. 8. Adversarial Risk Grid Map

• Discriminative Autonomy: The discriminative autonomy is directly related to the type of task being performed by the machine learning model. Machine learning tasks such as classification are highly dependent on the input data. As such, they have lower discriminative or conditional autonomy compared to tasks such as generative modeling, which depend less on the input data when predicting an outcome.

• Directive Autonomy: The directive autonomy of a machine learning model is a function of the machine learning technique. In supervised machine learning, there is less directive autonomy, since the model first needs to be learned with some form of labeled data. Machine learning techniques such as reinforcement learning depend less on a model being learned with any form of training data and possess much higher directive autonomy.

C. Cross Model vs Cross Dataset Attack

In discussing adversarial risk, the notion of transferability becomes pertinent. Transferability refers to the situation in which an adversarial example which is crafted for a specific deep learning model is found to be effective in causing a misclassification in a different model. This is known as a cross-model adversarial sample. In a similar situation, an adversarial sample may be generated by altering a particular dataset. If that sample is used to attack a deep learning system that was trained using a different dataset, it is called a cross-dataset adversarial sample.
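The two notions can be stated compactly (our notation): given an adversarial sample x_adv crafted against a model f_1 trained on dataset D_1 such that f_1(x_adv) differs from the true label y, transferability means

f_{2}(x_{\text{adv}}) \neq y, \quad \text{with } f_{2}\ \text{trained on } \mathcal{D}_{1}\ \text{(cross-model)} \quad \text{or on } \mathcal{D}_{2}\neq\mathcal{D}_{1}\ \text{(cross-dataset)}.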
VII. DEFENDING AGAINST ADVERSARIAL ATTACKS

Numerous researchers have aimed to review and classify defenses against adversarial attacks. Barreno et al. [8] first proposed three broad approaches for defending machine learning algorithms against adversarial attacks: regularization, randomization, and information hiding. Yuan et al. [88] classified the defenses into two broad strategies: proactive strategies and reactive strategies. Rosenberg et al. [9] organized the defenses based on the cyber security sub-domains (malware detection, spam detection, biometric systems, etc.). In our work, we classify the defenses based on generalized ML approaches.

Since adversarial examples represent a worst-case scenario of a distribution shift, the task of generating an adversarial sample is a non-convex optimization problem that can only be approximately solved. Adversarial attack methods are mostly optimization algorithms in search of a lower-boundary perturbation that corresponds to an adversarial sample [157]. These optimization algorithms often result in high-frequency outputs [158]. This, however, makes the defense methods against such adversarial samples vulnerable to adversarial samples that are generated within a low-frequency subspace.

In this section, we provide the most common defense methods in use today and classify them based on the strategy and approach. The reviewed defense methods are shown in Figure 9 below.

Fig. 9. Adversarial Defense Methods

1) Gradient Masking: Since most adversarial attack methods are based on the use of gradients, the gradient masking method modifies a machine learning model in an attempt to obscure its gradient from an attacker. Nayebi et al. [159] demonstrated the effect of gradient masking by saturating the sigmoid network, which results in a vanishing gradient effect in gradient-based attacks. The authors force the neural networks to operate in a nonlinear saturating regime. By using Jacobian regularization for each network layer, including the output layer, the model becomes insensitive to perturbations generated using the fast gradient sign method (FGSM) and iterative adversarial attacks [159]. However, [160] indicate that gradient masking behaved like overfitting in their experiments.

2) Defensive Distillation: The distillation technique was originally proposed by Hinton et al. [161] for transferring knowledge from large neural networks to smaller ones. To implement the distillation approach, Hinton et al. built 10 DNN models with the same architecture and training method and used soft targets to avoid the overfitting that occurs when using hard targets. They proved in their experiments that an ensemble model is able to transfer knowledge to the distilled model better than individual models. However, ensembles require large computation models with large networks and large datasets; therefore, they use specialist models that each use a subset of the dataset classes to reduce the amount of computation [161]. The technique was adapted by Papernot et al. [89] to defend against adversarial crafting by using the output of the original neural network to train a smaller network, rather than using distillation as originally proposed by Hinton. Defensive distillation was initially tested against adversarial attacks in computer vision, but further research is required to determine its effectiveness in other applications such as malware detection.

3) Adversarial Training: Adversarial training [6] is a method that aims to increase the robustness of a machine learning model to adversarial samples by minimizing the loss L on data/label pairs {Xi, yi} while the corresponding loss is maximized over adversarial perturbations. Szegedy et al. [44] originally proposed a three-step method known as adversarial training for defending against adversarial attacks: (1) train the classifier on the original dataset; (2) generate adversarial samples; (3) iterate additional training epochs using the adversarial samples. Generally, adversarial training is based on a min-max formulation that solves two problems: attacks as an inner maximization problem and defenses as an outer minimization problem [6]. The inner maximization intends to generate adversarial versions of the samples that maximize the model loss, whereas the outer minimization intends to minimize the loss by finding model parameters that build a more robust model with less adversarial loss [6]. Numerous researchers have tested and evaluated the effect of adversarial training in the network security domain [166], [167]. They concluded that it improves the classification performance of the machine learning model and makes it more resilient to adversarial crafting.
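A minimal sketch of the three-step procedure is shown below (PyTorch, with a toy flow-feature classifier and synthetic data standing in for a real NIDS and dataset; FGSM is used as the sample generator, as in several of the surveyed works):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy flow-feature classifier standing in for an ML-based NIDS.
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    def fgsm(x, y, eps=0.05):
        # Step 2: craft adversarial samples with the fast gradient sign method.
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        return (x + eps * x.grad.sign()).detach()

    # Synthetic stand-in for labelled network data.
    x, y = torch.randn(512, 20), torch.randint(0, 2, (512,))

    for epoch in range(10):
        # Step 1: train on the original data.
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
        # Steps 2-3: generate adversarial samples and train on them as well.
        x_adv = fgsm(x, y)
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        opt.step()

In the min-max view described above, the FGSM step approximates the inner maximization, while the optimizer step on the adversarial batch performs the outer minimization.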
However, adversarial training has certain limitations, particularly in the context of adversarial machine learning in network security. First, the adversary may implement an attack method different from the one which was used in training the network. Secondly, the adversary may design adversarial perturbations for a deep learning model that has already been trained with adversarial training, and craft new adversarial perturbations which would make the previous adversarial training ineffective. It has also been shown that adversarial training can reduce the performance of deep learning models on clean inputs, as discussed in [68].

4) Gradient Regularization: Gradient regularization is a technique that penalizes large changes in the output of some neural network layer in order to adjust machine learning models, minimize the loss function, increase model robustness and prevent overfitting or underfitting. Many researchers have tested this approach as a defense against adversarial attacks. Ros et al. [162] found that training DNNs with gradient regularization improves the robustness to adversarial perturbations as much as or more than adversarial training. They also found that combining both approaches (gradient regularization and adversarial training) achieves greater robustness. The main drawback of gradient regularization is that it doubles the training time per batch.

5) Detecting Adversarial Samples: Several approaches are used to detect the presence of adversarial samples in the training phase of a machine learning model. One such approach, proposed by [139], works on the premise that adversarial samples have a higher uncertainty than clean data and uses a Bayesian neural network, realized through the dropout layers of neural networks, to estimate the extent of uncertainty in the input data and thereby detect the adversarial samples. Other approaches include the use of probability divergence proposed by [168], as well as the use of an auxiliary network of the original network introduced by Metzen et al. in [169]. Ren et al. [170] also proposed adversarial attack detection and adversarial sample recognition methods that use the causal inference technique to establish a causal model describing the generation and performance of adversarial samples that attack DNNs.

6) Feature Reduction: Other potential defenses against adversarial attacks have been proposed. Simple feature reduction was evaluated by Grosse et al. [163] but was found inadequate in defending against adversarial attacks.

7) Input Randomization: Some researchers have tried randomization operations on the input of the model as a defense against adversarial attacks on machine learning. For example, Xie et al. [164] tried random resizing and adding random padding to inputs, and experiments demonstrated that their proposed method is effective. Zhang et al. [56] also tried a similar method by injecting random Gaussian noise. These approaches have multiple advantages, such as simplicity, low computational complexity, and eliminating the need for additional training. The main disadvantage is that using this defense technique in the network security domain could change the functionality of the inputs (executables, packets, etc.). In our opinion, this method needs to be evaluated in the network security domain.

8) Ensemble Defenses: Similar to the idea of ensemble learning, which combines one or more machine learning techniques, researchers have also proposed the use of multiple defense strategies as a defense technique against adversarial samples. PixelDefend was proposed by [165] to combine adversarial detection techniques with one or more other methods for creating a more robust defense against adversarial attacks.
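The ensemble-learning idea that this defense borrows from is easy to prototype; the sketch below (scikit-learn, with synthetic data in place of a real traffic feature set) combines the classifier families most frequently attacked in the surveyed work. Note that a plain model ensemble alone is weaker than the combined detection-plus-purification strategy of PixelDefend.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic stand-in for a labelled network-traffic feature set.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    # Soft-voting ensemble over RF, LR and KNN classifiers.
    ensemble = VotingClassifier(
        estimators=[("rf", RandomForestClassifier()),
                    ("lr", LogisticRegression(max_iter=1000)),
                    ("knn", KNeighborsClassifier())],
        voting="soft")
    ensemble.fit(X, y)
    print(ensemble.score(X, y))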
adversarial risk grid map. Our findings suggest that the in malware machine learning models in machine learning
adversarial risk grid map provides a promising future for applications such as malware detection.
the security of artificial intelligence and machine learning in
network security. Machine learning based network security C. Malware Detection Approaches
applications that are more resilient to adversarial attacks In the majority of cases, Android malware detection is
can be designed by leveraging on the adversarial risk grid posed as a binary classification problem in which a classifier
map. We observed that the misclassification achieved by an is used to determine whether an app is malicious or not.
adversarial attack is dependent significantly on the design Malware detection take three general approaches which are
of the adversarial attack algorithm with the context of each dynamic, static, or hybrid. Significant overhead is usually
specific attack . White-box, Evasion attacks against endpoint required in order to extract dynamic features because it
protection systems (malware detection) are the most common requires monitoring the behavior of apps at run time. Several
attacks. While there is limited research in adversarial attacks of the studies we examined have focused on instances
against process behavior and user behavior analysis, use in which static features were extracted, including required
cases of machine learning in network security, endpoint permissions, actions, and application programming interface
protection, network protection and application security have (API) calls. In our literature review, we did not come across
been well researched. any work in which adversarial attacks were successfully
carried out against machine learning based malware detection
B. Transferability with regards to machine learning tech- systems in which dynamic features were extracted.
nique
Transferability of adversarial samples [72] [171] has been D. Quantitative evaluation of adversarial attacks
shown to be more effective with targeted adversarial samples In network security, majority of the adversarial attacks
[73]. This implies that non-targeted adversarial samples reported target the integrity aspect of the CIA triad, with the
(reliability attacks) which are solely aimed at causing a intent of causing a misclassification. A quantitative analysis
misclassification, are more likely to transfer from one model of the attacks’ efficiency for the four reviewed categories
to the other. In furtherance to this phenomenon, we observe (malware detection, phishing detection, spam detection, and
that adversarial attacks in network security are less likely network anomaly detection) was observed. After calculating
to transfer from one machine learning technique to another. the average attack success rate per class, we have found that
Transferability of adversarial defences in network security the most significant adversarial effect was in the malware
in also impacted by to the heterogenous nature of the detection and the network anomaly detection domains, in
perturbed features. While this is has a positive side with which the adversarial attacks’ success rates averaged more
regards to preventing transferable defenses, it also makes it than 90 percent. It is worth mentioning that we think that
more difficult in real world situations. From our observation, a number of these attacks are theoretical and need more
adversarial attacks in problem space are more difficult to investigation to deploy them in practical settings, thus the
generate, more difficult to defend against and less chances quantitative effect of some of the reviewed attacks could
of being transferable. be exaggerated. However, we find these results as a good
In our research, we observed that a significant number of features are typically perturbed in the process of generating an adversarial sample. This is a suboptimal approach. To our knowledge, no publication has yet explored the challenge of identifying the ideal subset of features that needs to be perturbed to create an adversarial sample. In the field of computer vision, Guo et al. [158] restricted the search for adversarial samples to the low-frequency domain, thereby reducing query complexity.
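To make the idea concrete, the following sketch reproduces the spirit of the low-frequency restriction in [158]: a candidate perturbation is expressed in the 2-D DCT domain, all but a small block of low-frequency coefficients is discarded, and the result is mapped back to pixel space. The 32x32 image size and the cut-off of 8 coefficients per axis are illustrative choices, not parameters from [158].

# Restrict a random perturbation to the low-frequency DCT subspace.
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
size, keep = 32, 8                    # search only the top-left keep x keep DCT block
delta = rng.standard_normal((size, size))

coeffs = dctn(delta, norm="ortho")
mask = np.zeros_like(coeffs)
mask[:keep, :keep] = 1.0
low_freq_delta = idctn(coeffs * mask, norm="ortho")

# Only keep**2 of the size**2 coefficients need to be searched, which is what
# reduces the query complexity of a black-box attack.
print(low_freq_delta.shape, int(mask.sum()), size * size)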
We reviewed defenses against adversarial attacks on machine learning applications in network security and note two major limitations in the existing research on adversarial defenses. Firstly, most defenses are designed to protect against attacks on machine learning applications in computer vision. Secondly, the defenses studied are usually designed for a specific attack or for part of an attack. A generalized defense model against adversarial attacks is at best still theoretical, as research on generalized defense models is in its early stages [172]. Furthermore, our findings indicate that defenses against adversarial attacks are specific to a particular type of attack and are not necessarily transferable; recent research [72] has studied this transferability in the context of each specific attack.

The features used in machine learning based malware detection may be dynamic, static, or hybrid. Significant overhead is usually required to extract dynamic features, because doing so requires monitoring the behavior of apps at run time. Several of the studies we examined focused on settings in which static features were extracted, including required permissions, actions, and application programming interface (API) calls. In our literature review, we did not come across any work in which adversarial attacks were successfully carried out against machine learning based malware detection systems that use dynamically extracted features.

D. Quantitative evaluation of adversarial attacks

In network security, the majority of the reported adversarial attacks target the integrity aspect of the CIA triad, with the intent of causing a misclassification. We performed a quantitative analysis of attack efficiency for the four reviewed categories (malware detection, phishing detection, spam detection, and network anomaly detection). After calculating the average attack success rate per category, we found that the most significant adversarial effect was in the malware detection and network anomaly detection domains, in which the reported attack success rates averaged more than 90 percent. It is worth mentioning that a number of these attacks are theoretical and need further investigation before they can be deployed in practical settings, so the quantitative effect of some of the reviewed attacks may be exaggerated. Nevertheless, we consider these results a good indication of the malicious potential of adversarial attacks on network security domains.

Quantifying the efficiency of adversarial example generation is an emerging problem, and several approaches have been proposed in the recent literature. In [173], a new performance metric called the effective generation rate (EGR) was proposed, defined as the ratio n′/n, where n is the number of adversarial examples generated by the attacker and n′ is the number of those examples that successfully evade both the malware detector and the adversarial example detector.
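The two quantities discussed above, the average attack success rate per application category and the EGR of [173], can be computed directly from reported attack results. The sketch below shows one way to do so; the category names and counts are placeholders for illustration rather than numbers taken from the reviewed papers.

# Aggregate per-category success rates and compute the EGR = n'/n of [173].
from collections import defaultdict

def average_success_rate(results):
    """results: list of (category, success_rate) tuples, one per reviewed attack."""
    per_category = defaultdict(list)
    for category, rate in results:
        per_category[category].append(rate)
    return {c: sum(r) / len(r) for c, r in per_category.items()}

def effective_generation_rate(n_generated, n_evading_both):
    """n_evading_both counts samples that evade both the malware detector
    and the adversarial-example detector."""
    return n_evading_both / n_generated if n_generated else 0.0

# Hypothetical numbers, for illustration only.
reviewed = [("malware", 0.95), ("malware", 0.89), ("anomaly", 0.92), ("phishing", 0.70)]
print(average_success_rate(reviewed))
print(effective_generation_rate(n_generated=1000, n_evading_both=640))  # 0.64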
E. Differences between adversarial attacks in network security and computer vision

1) In image recognition, the primary features used in adversarial perturbation are the pixels of the image. In network security, however, there is great variation in the types of features that may be used, and as such, the perturbation scope for adversarial attacks is much larger.
2) Adversarial attacks in network security differ from those in computer vision because the objects being perturbed are data objects rather than images. As a result, the perturbed features are more diverse and heterogeneous. The consequence of this heterogeneity is that adversarial attacks in network security are more difficult to defend against, and transferable or universal defenses are harder to achieve. It should be noted that significant strides have been made in computer vision with regard to developing universal defenses, but this is still an infant research area in network security. The relevant features also vary greatly from one network security application to another, and in most cases the features used for the machine learning classification are also the features that are perturbed to generate the adversarial samples.
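The heterogeneity noted in item 2 above has a practical consequence for attackers: unlike pixels, perturbed network features must remain semantically valid. A minimal sketch of such a validity projection for a hypothetical flow record is shown below; the feature names, types, and constraints are assumptions made for illustration, not a feature set from any of the reviewed systems.

# Project a raw perturbation of a flow record back onto the valid feature space.
import numpy as np

FEATURES = ["duration", "src_bytes", "dst_bytes", "protocol_id"]
INTEGER = {"src_bytes", "dst_bytes", "protocol_id"}
IMMUTABLE = {"protocol_id"}              # changing the protocol would break the flow

def project_to_valid(x, delta):
    """Apply a candidate perturbation, then enforce type and range constraints."""
    x_adv = x + delta
    for i, name in enumerate(FEATURES):
        if name in IMMUTABLE:
            x_adv[i] = x[i]              # categorical fields are left untouched
        if name in INTEGER:
            x_adv[i] = round(x_adv[i])   # byte counts and identifiers stay integral
        x_adv[i] = max(x_adv[i], 0.0)    # no negative durations or counts
    return x_adv

flow = np.array([1.2, 480.0, 1500.0, 6.0])   # duration, src_bytes, dst_bytes, TCP
print(project_to_valid(flow, np.array([0.4, -30.7, 12.3, 1.0])))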
IX. CONCLUSION AND FUTURE WORK

We have presented a first-of-its-kind survey of adversarial attacks on machine learning in network security. The previous survey [17] that we reviewed discussed adversarial attacks only against deep learning in computer vision. We introduced a new classification of adversarial attacks based on the applications of machine learning in network security and developed a matrix that correlates the various types of adversarial attacks with a taxonomy-based classification to determine their effectiveness in causing a misclassification. We also presented the novel concept of an adversarial risk grid map for machine learning in network security.
In our review of defenses against adversarial attacks, we found that although numerous defenses have been proposed against specific adversarial attacks, research on generalized defenses against adversarial attacks is still not well established [172]. In future work, we will study generalized defenses against adversarial attacks to understand whether a generalized approach to adversarial defense is attainable in practice. In addition, we will examine the interpretability of adversarial risk to further understand why reduced adversarial vulnerability occurs in some settings, and its implications for other applications of machine learning such as computer vision and natural language processing.
Future work: Based on our research, adversarial attacks have mostly been carried out on data at rest, with very few successful attempts against data in transit or streaming data, such as [174] [175]. However, in network security domains such as intrusion detection, realistic adversarial attacks will have to be carried out against data in transit. Hence, more research is needed in this area to understand the potential risks of adversarial attacks against data in transit and the possible defense techniques.
Adversarial attacks against federated learning [176] are also still an open area of research. Federated learning [177], which differs from conventional distributed computation, is a setting in which each client performs its machine learning computation locally without sending its data to the cloud. As such, the cloud provider does not have a complete view of the clients' data or local models, which yields significant gains for privacy and confidentiality.
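To clarify this setting, the sketch below shows a round-based federated-averaging loop in which clients train locally and share only model weights with the server; the linear model, the synthetic client data, and the hyper-parameters are illustrative assumptions rather than a specific federated learning system from the literature.

# One federated-averaging training loop: data never leaves the clients.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """A client's local training: a few gradient steps of linear regression."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Three clients, each holding a private dataset.
clients = [(rng.standard_normal((50, 4)), rng.standard_normal(50)) for _ in range(3)]
global_w = np.zeros(4)

for _ in range(10):
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)   # the server averages weights, never sees data

print(global_w)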
Adversarial attacks have so far been demonstrated only against classification and clustering tasks in network security. Across the more than fifty attacks on machine learning in network security covered in the reviewed literature, we found no attempt to implement adversarial attacks against any other network security task. This is consistent with our adversarial risk grid map, illustrated in Figure 8, in which we posit that adversarial risk varies with the type of network security task being performed. Our study also notes that the adversaries in network security are more diverse than those in computer vision; as such, the arms race between attackers and defenders is even more pronounced in network security than in computer vision.

Several authors have shown that deep learning can be performed on encrypted data [178] [179] [180]. In our study, however, we did not observe any adversarial attack that succeeded against machine learning on encrypted data. Even though much of the data in network security is encrypted, the generation of adversarial samples against encrypted data remains an area of open research. Whether performing encryption before applying machine learning to the data can serve as a dependable defense against adversarial machine learning in network security is therefore a promising idea that is subject to future research.

The use of deep learning as a technique for encryption itself is quite restrictive [181], mostly due to the computational costs of deep learning. Research is also required to understand the effects of adversarial attacks against deep learning used for encryption.

ACKNOWLEDGEMENT

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the NSERC Discovery Grant program.

REFERENCES

[1] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples (2014),” arXiv preprint arXiv:1412.6572.
[2] J. Kos, I. Fischer, and D. Song, “Adversarial examples for generative models,” in 2018 IEEE Security and Privacy Workshops (SPW), pp. 36–42, IEEE, 2018.
[3] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, “Deepfool: a simple and accurate method to fool deep neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2574–2582, 2016.
[4] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, “The limitations of deep learning in adversarial settings,” in Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, pp. 372–387, IEEE, 2016.
[5] I. Corona, G. Giacinto, and F. Roli, “Adversarial attacks against intrusion detection systems: Taxonomy, solutions and open issues,” Information Sciences, vol. 239, pp. 201–225, 2013.
[6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” arXiv preprint arXiv:1706.06083, 2017.
[7] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against machine learning,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519, ACM, 2017.
[8] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, “Can machine learning be secure?,” in Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 16–25, ACM, 2006.
[9] I. Rosenberg, A. Shabtai, Y. Elovici, and L. Rokach, “Adversarial machine learning attacks and defense methods in the cyber security domain,” ACM Computing Surveys (CSUR), vol. 54, no. 5, pp. 1–36, 2021.
[10] V. Duddu, “A survey of adversarial machine learning in cyber [33] Y. Li, R. Ma, and R. Jiao, “A hybrid malicious code detection method
warfare,” Defence Science Journal, vol. 68, no. 4, pp. 356–366, 2018. based on deep learning,” methods, vol. 9, no. 5, 2015.
[11] C. Zhang, P. Patras, and H. Haddadi, “Deep learning in mobile and [34] R. Basnet, S. Mukkamala, and A. H. Sung, “Detection of phishing
wireless networking: A survey,” IEEE Communications Surveys & attacks: A machine learning approach,” in Soft Computing Applica-
Tutorials, 2019. tions in Industry, pp. 373–383, Springer, 2008.
[12] B. Biggio and F. Roli, “Wild patterns: Ten years after the rise of [35] Y. Dong, Y. Zhang, H. Ma, Q. Wu, Q. Liu, K. Wang, and W. Wang,
adversarial machine learning,” Pattern Recognition, vol. 84, pp. 317– “An adaptive system for detecting malicious queries in web attacks,”
331, 2018. Science China Information Sciences, vol. 61, no. 3, p. 032114, 2018.
[13] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, [36] H. Le, Q. Pham, D. Sahoo, and S. C. Hoi, “Urlnet: Learning a url
and D. Mané, “Concrete problems in ai safety,” arXiv preprint representation with deep learning for malicious url detection,” arXiv
arXiv:1606.06565, 2016. preprint arXiv:1802.03162, 2018.
[14] Y. Vorobeychik and M. Kantarcioglu, “Adversarial machine learning,” [37] K. Revett, F. Gorunescu, M. Gorunescu, M. Ene, P. S. T. Magalhães,
Synthesis Lectures on Artificial Intelligence and Machine Learning, and H. D. d. Santos, “A machine learning approach to keystroke
vol. 12, no. 3, pp. 1–169, 2018. dynamics based user authentication,” 2007.
[15] Z. Wang, “Deep learning-based intrusion detection with adversaries,” [38] E. Bursztein, M. Martin, and J. Mitchell, “Text-based captcha
IEEE Access, vol. 6, pp. 38367–38384, 2018. strengths and weaknesses,” in Proceedings of the 18th ACM confer-
[16] X. Liu, Y. Lin, H. Li, and J. Zhang, “Adversarial examples: Attacks ence on Computer and communications security, pp. 125–138, ACM,
on machine learning-based malware visualization detection methods,” 2011.
arXiv preprint arXiv:1808.01546, 2018. [39] K. Lee, J. Caverlee, and S. Webb, “Uncovering social spammers:
[17] N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learn- social honeypots+ machine learning,” in Proceedings of the 33rd
ing in computer vision: A survey,” arXiv preprint arXiv:1801.00553, international ACM SIGIR conference on Research and development
2018. in information retrieval, pp. 435–442, ACM, 2010.
[18] S. Qiu, Q. Liu, S. Zhou, and C. Wu, “Review of artificial intelligence [40] M. Kravchik and A. Shabtai, “Anomaly detection; industrial
adversarial attack and defense technologies,” Applied Sciences, vol. 9, control systems; convolutional neural networks,” arXiv preprint
no. 5, p. 909, 2019. arXiv:1806.08110, 2018.
[19] Q. Liu, P. Li, W. Zhao, W. Cai, S. Yu, and V. C. Leung, “A survey [41] Y. Kou, C.-T. Lu, S. Sirwongwattana, and Y.-P. Huang, “Survey of
on security threats and defensive techniques of machine learning: A fraud detection techniques,” in Networking, sensing and control, 2004
data driven view,” IEEE access, vol. 6, pp. 12103–12117, 2018. IEEE international conference on, vol. 2, pp. 749–754, IEEE, 2004.
[20] A. L. Buczak and E. Guven, “A survey of data mining and ma- [42] Y. G. Şahin and E. Duman, “Detecting credit card fraud by decision
chine learning methods for cyber security intrusion detection,” IEEE trees and support vector machines,” 2011.
Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1153–1176,
[43] E. Prouff, R. Strullu, R. Benadjila, E. Cagli, and C. Dumas, “Study of
2015.
deep learning techniques for side-channel analysis and introduction to
[21] J. Gardiner and S. Nagaraja, “On the security of machine learning in ascad database.,” IACR Cryptology ePrint Archive, vol. 2018, p. 53,
malware c&c detection: A survey,” ACM Computing Surveys (CSUR), 2018.
vol. 49, no. 3, p. 59, 2016.
[44] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfel-
[22] H. X. Y. M. Hao-Chen, L. D. Deb, H. L. J.-L. T. Anil, and K. Jain,
low, and R. Fergus, “Intriguing properties of neural networks,” arXiv
“Adversarial attacks and defenses in images, graphs and text: A
preprint arXiv:1312.6199, 2013.
review,” International Journal of Automation and Computing, vol. 17,
[45] L. Chen and Y. Ye, “Secmd: Make machine learning more secure
no. 2, pp. 151–178, 2020.
against adversarial malware attacks,” in Australasian Joint Confer-
[23] W. E. Zhang, Q. Z. Sheng, A. Alhazmi, and C. Li, “Adversarial
ence on Artificial Intelligence, pp. 76–89, Springer, 2017.
attacks on deep-learning models in natural language processing: A
survey,” ACM Transactions on Intelligent Systems and Technology [46] L. Chen, S. Hou, Y. Ye, and S. Xu, “Droideye: Fortifying security of
(TIST), vol. 11, no. 3, pp. 1–41, 2020. learning-based classifier against adversarial android malware attacks,”
[24] L. Sun, M. Tan, and Z. Zhou, “A survey of practical adversarial in 2018 IEEE/ACM International Conference on Advances in Social
example attacks,” Cybersecurity, vol. 1, no. 1, p. 9, 2018. Networks Analysis and Mining (ASONAM), pp. 782–789, IEEE, 2018.
[25] V. Ford and A. Siraj, “Applications of machine learning in cyber [47] L. Chen, Y. Ye, and T. Bourlai, “Adversarial machine learning in
security,” in Proceedings of the 27th International Conference on malware detection: Arms race between evasion attack and defense,”
Computer Applications in Industry and Engineering, 2014. in Intelligence and Security Informatics Conference (EISIC), 2017
[26] J. Singh and M. J. Nene, “A survey on machine learning techniques European, pp. 99–106, IEEE, 2017.
for intrusion detection systems,” International Journal of Advanced [48] B. Kolosnjaji, A. Demontis, B. Biggio, D. Maiorca, G. Giacinto,
Research in Computer and Communication Engineering, vol. 2, C. Eckert, and F. Roli, “Adversarial malware binaries: Evading
no. 11, pp. 4349–4355, 2013. deep learning for malware detection in executables,” arXiv preprint
[27] M. Almseidin, M. Alzubi, S. Kovacs, and M. Alkasassbeh, “Evalua- arXiv:1803.04173, 2018.
tion of machine learning algorithms for intrusion detection system,” [49] L. Muñoz-González, B. Biggio, A. Demontis, A. Paudice, V. Won-
in Intelligent Systems and Informatics (SISY), 2017 IEEE 15th grassamee, E. C. Lupu, and F. Roli, “Towards poisoning of deep
International Symposium on, pp. 000277–000282, IEEE, 2017. learning algorithms with back-gradient optimization,” in Proceedings
[28] A.-C. Sima, K. Stockinger, K. Affolter, M. Braschler, P. Monte, and of the 10th ACM Workshop on Artificial Intelligence and Security,
L. Kaiser, “A hybrid approach for alarm verification using stream pp. 27–38, 2017.
processing, machine learning and text analytics,” in International [50] B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Šrndić, P. Laskov,
Conference on Extending Database Technology (EDBT), March 26- G. Giacinto, and F. Roli, “Evasion attacks against machine learning
29, 2018, ACM, 2018. at test time,” in Joint European conference on machine learning and
[29] P. Laskov, P. Düssel, C. Schäfer, and K. Rieck, “Learning intrusion knowledge discovery in databases, pp. 387–402, Springer, 2013.
detection: supervised or unsupervised?,” in International Conference [51] J. Jo and Y. Bengio, “Measuring the tendency of cnns to learn surface
on Image Analysis and Processing, pp. 50–57, Springer, 2005. statistical regularities,” arXiv preprint arXiv:1711.11561, 2017.
[30] B. Kolosnjaji, A. Zarras, G. Webster, and C. Eckert, “Deep learning [52] D. Hendrycks and T. Dietterich, “Benchmarking neural network ro-
for classification of malware system call sequences,” in Australasian bustness to common corruptions and perturbations,” in International
Joint Conference on Artificial Intelligence, pp. 137–149, Springer, Conference on Learning Representations, 2018.
2016. [53] L. Huang, A. D. Joseph, B. Nelson, B. I. Rubinstein, and J. Tygar,
[31] K. Rieck, P. Trinius, C. Willems, and T. Holz, “Automatic analysis “Adversarial machine learning,” in Proceedings of the 4th ACM
of malware behavior using machine learning,” Journal of Computer workshop on Security and artificial intelligence, pp. 43–58, ACM,
Security, vol. 19, no. 4, pp. 639–668, 2011. 2011.
[32] A. Kumara and C. Jaidhar, “Automated multi-level malware detection [54] Y. Fan, B. Wu, T. Li, Y. Zhang, M. Li, Z. Li, and Y. Yang, “Sparse
system based on reconstructed semantic view of executables using adversarial attack via perturbation factorization,” in European con-
machine learning techniques at vmm,” Future Generation Computer ference on computer vision, pp. 35–50, Springer, 2020.
Systems, vol. 79, pp. 431–446, 2018. [55] W. Hu and Y. Tan, “Black-box attacks against rnn based malware
detection algorithms,” in Workshops at the Thirty-Second AAAI Conference on Computer Vision and Pattern Recognition, pp. 4954–
Conference on Artificial Intelligence, 2018. 4963, 2019.
[56] Y. Zhang and P. Liang, “Defending against whitebox adversarial [76] B. Biggio, B. Nelson, and P. Laskov, “Poisoning attacks against
attacks via randomized discretization,” in The 22nd International support vector machines,” arXiv preprint arXiv:1206.6389, 2012.
Conference on Artificial Intelligence and Statistics, pp. 684–693, [77] T. Gu, B. Dolan-Gavitt, and S. Garg, “Badnets: Identifying vulnera-
PMLR, 2019. bilities in the machine learning model supply chain,” arXiv preprint
[57] N. Carlini, A. Athalye, N. Papernot, W. Brendel, J. Rauber, arXiv:1708.06733, 2017.
D. Tsipras, I. Goodfellow, A. Madry, and A. Kurakin, “On evaluating [78] S. Sabour, Y. Cao, F. Faghri, and D. J. Fleet, “Adversarial manip-
adversarial robustness,” arXiv preprint arXiv:1902.06705, 2019. ulation of deep representations,” arXiv preprint arXiv:1511.05122,
[58] F. Pierazzi, F. Pendlebury, J. Cortellazzi, and L. Cavallaro, “Intriguing 2015.
properties of adversarial ml attacks in the problem space,” in 2020 [79] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in
IEEE Symposium on Security and Privacy (SP), pp. 1332–1349, the physical world,” arXiv preprint arXiv:1607.02533, 2016.
IEEE, 2020. [80] U. Jang, X. Wu, and S. Jha, “Objective metrics and gradient descent
[59] A. Pattanaik, Z. Tang, S. Liu, G. Bommannan, and G. Chowdhary, algorithms for adversarial examples in machine learning,” in Proceed-
“Robust deep reinforcement learning with adversarial attacks,” in ings of the 33rd Annual Computer Security Applications Conference,
Proceedings of the 17th International Conference on Autonomous pp. 262–277, 2017.
Agents and MultiAgent Systems, pp. 2040–2042, International Foun- [81] P.-Y. Chen, Y. Sharma, H. Zhang, J. Yi, and C.-J. Hsieh, “Ead: elastic-
dation for Autonomous Agents and Multiagent Systems, 2018. net attacks to deep neural networks via adversarial examples,” in
[60] E. Tabassi, K. J. Burns, M. Hadjimichael, A. D. Molina-Markham, Thirty-second AAAI conference on artificial intelligence, 2018.
and J. T. Sexton, “A taxonomy and terminology of adversarial [82] N. Carlini and D. Wagner, “Towards evaluating the robustness of
machine learning,” 2019. neural networks,” in 2017 IEEE Symposium on Security and Privacy
[61] A. Shafahi, W. R. Huang, M. Najibi, O. Suciu, C. Studer, T. Dumitras, (SP), pp. 39–57, IEEE, 2017.
and T. Goldstein, “Poison frogs! targeted clean-label poisoning [83] F. Croce and M. Hein, “Reliable evaluation of adversarial robustness
attacks on neural networks,” in Advances in Neural Information with an ensemble of diverse parameter-free attacks,” arXiv preprint
Processing Systems, pp. 6103–6113, 2018. arXiv:2003.01690, 2020.
[62] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, [84] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard,
“Stealing machine learning models via prediction apis,” in 25th “Universal adversarial perturbations,” in Proceedings of the IEEE
{USENIX} Security Symposium ({USENIX} Security 16), pp. 601– conference on computer vision and pattern recognition, pp. 1765–
618, 2016. 1773, 2017.
[63] N. Papernot, P. McDaniel, A. Sinha, and M. P. Wellman, “Sok: [85] A. Ghiasi, A. Shafahi, and T. Goldstein, “Breaking certified defenses:
Security and privacy in machine learning,” in 2018 IEEE European Semantic adversarial examples with spoofed robustness certificates,”
Symposium on Security and Privacy (EuroS&P), pp. 399–414, IEEE, arXiv preprint arXiv:2003.08937, 2020.
2018. [86] T. B. Brown, D. Mané, A. Roy, M. Abadi, and J. Gilmer, “Adversarial
[64] B. Kim, Y. E. Sagduyu, K. Davaslioglu, T. Erpek, and S. Ulukus, patch,” arXiv preprint arXiv:1712.09665, 2017.
“Over-the-air adversarial attacks on deep learning based modulation [87] F. Croce, M. Andriushchenko, and M. Hein, “Provable robustness
classifier over wireless channels,” in 2020 54th Annual Conference of relu networks via maximization of linear regions,” in the 22nd
on Information Sciences and Systems (CISS), pp. 1–6, IEEE, 2020. International Conference on Artificial Intelligence and Statistics,
[65] M. Usama, M. Asim, S. Latif, J. Qadir, et al., “Generative adversarial pp. 2057–2066, 2019.
networks for launching and thwarting adversarial attacks on net- [88] X. Yuan, P. He, Q. Zhu, R. R. Bhat, and X. Li, “Adversarial
work intrusion detection systems,” in 2019 15th International Wire- examples: Attacks and defenses for deep learning,” arXiv preprint
less Communications & Mobile Computing Conference (IWCMC), arXiv:1712.07107, 2017.
pp. 78–83, IEEE, 2019. [89] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami, “Distil-
[66] M. Juuti, S. Szyller, S. Marchal, and N. Asokan, “Prada: protecting lation as a defense to adversarial perturbations against deep neural
against dnn model stealing attacks,” in 2019 IEEE European Sympo- networks,” in 2016 IEEE Symposium on Security and Privacy (SP),
sium on Security and Privacy (EuroS&P), pp. 512–527, IEEE, 2019. pp. 582–597, IEEE, 2016.
[67] A. Albaseer, B. S. Ciftler, and M. M. Abdallah, “Performance evalu- [90] M. Mosbach, M. Andriushchenko, T. Trost, M. Hein, and D. Klakow,
ation of physical attacks against e2e autoencoder over rayleigh fading “Logit pairing methods can fool gradient-based attacks,” arXiv
channel,” in 2020 IEEE International Conference on Informatics, IoT, preprint arXiv:1810.12042, 2018.
and Enabling Technologies (ICIoT), pp. 177–182, IEEE, 2020. [91] F. Croce and M. Hein, “Minimally distorted adversarial ex-
[68] M. Sadeghi and E. G. Larsson, “Physical adversarial attacks against amples with a fast adaptive boundary attack,” arXiv preprint
end-to-end autoencoder communication systems,” IEEE Communica- arXiv:1907.02044, 2019.
tions Letters, vol. 23, no. 5, pp. 847–850, 2019. [92] J. Chen, M. I. Jordan, and M. J. Wainwright, “Hopskipjumpat-
[69] A. Ilyas, L. Engstrom, A. Athalye, and J. Lin, “Black-box adver- tack: A query-efficient decision-based attack,” arXiv preprint
sarial attacks with limited queries and information,” arXiv preprint arXiv:1904.02144, 2019.
arXiv:1804.08598, 2018. [93] A. K. Das, S. Zeadally, and D. He, “Taxonomy and analysis of se-
[70] M. Jagielski, N. Carlini, D. Berthelot, A. Kurakin, and N. Papernot, curity protocols for internet of things,” Future Generation Computer
“High accuracy and high fidelity extraction of neural networks,” in Systems, vol. 89, pp. 110–125, 2018.
29th {USENIX} Security Symposium ({USENIX} Security 20), 2020. [94] S. Hansman and R. Hunt, “A taxonomy of network and computer
[71] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks attacks,” Computers & Security, vol. 24, no. 1, pp. 31–43, 2005.
that exploit confidence information and basic countermeasures,” in [95] “Total malware.”
Proceedings of the 22nd ACM SIGSAC Conference on Computer and [96] A. Calleja, A. Martín, H. D. Menéndez, J. Tapiador, and D. Clark,
Communications Security, pp. 1322–1333, 2015. “Picking on the family: Disrupting android malware triage by forc-
[72] N. Papernot, P. McDaniel, and I. Goodfellow, “Transferability in ing misclassification,” Expert Systems with Applications, vol. 95,
machine learning: from phenomena to black-box attacks using ad- pp. 113–126, 2018.
versarial samples,” arXiv preprint arXiv:1605.07277, 2016. [97] O. Suciu, R. Marginean, Y. Kaya, H. Daume III, and T. Dumitras,
[73] Y. Liu, X. Chen, C. Liu, and D. Song, “Delving into transfer- “When does machine learning {FAIL}? generalized transferability
able adversarial examples and black-box attacks,” arXiv preprint for evasion and poisoning attacks,” in 27th {USENIX} Security
arXiv:1611.02770, 2016. Symposium ({USENIX} Security 18), pp. 1299–1316, 2018.
[74] J. R. Correia-Silva, R. F. Berriel, C. Badue, A. F. de Souza, and [98] K. S. Han, J. H. Lim, B. Kang, and E. G. Im, “Malware analysis
T. Oliveira-Santos, “Copycat cnn: Stealing knowledge by persuading using visualized images and entropy graphs,” International Journal
confession with random non-labeled data,” in 2018 International of Information Security, vol. 14, no. 1, pp. 1–14, 2015.
Joint Conference on Neural Networks (IJCNN), pp. 1–8, IEEE, 2018. [99] H. Bostani and V. Moonsamy, “Evadedroid: A practical evasion
[75] T. Orekondy, B. Schiele, and M. Fritz, “Knockoff nets: Stealing attack on machine learning for black-box android malware detection,”
functionality of black-box models,” in Proceedings of the IEEE arXiv preprint arXiv:2110.03301, 2021.
[100] W. Hu and Y. Tan, “Generating adversarial malware examples for importance guided attack: A model agnostic adversarial attack,” arXiv
black-box attacks based on gan,” arXiv preprint arXiv:1702.05983, preprint arXiv:2106.14815, 2021.
2017. [123] A. AlEroud and G. Karabatis, “Bypassing detection of url-based
[101] J. Yuan, S. Zhou, L. Lin, F. Wang, and J. Cui, “Black-box adversarial phishing attacks using generative adversarial deep neural networks,”
attacks against deep learning based malware binaries detection with in Proceedings of the Sixth International Workshop on Security and
gan,” in ECAI 2020, pp. 2536–2542, IOS Press, 2020. Privacy Analytics, pp. 53–60, 2020.
[102] A. Al-Dujaili et al., “Adversarial deep learning for robust detection [124] R. Al-Qurashi, A. AlEroud, A. A. Saifan, M. Alsmadi, and I. Als-
of binary encoded malware,” in 2018 IEEE Security and Privacy madi, “Generating optimal attack paths in generative adversarial
Workshops (SPW), pp. 76–82, IEEE, 2018. phishing,” in 2021 IEEE International Conference on Intelligence
[103] F. Kreuk, A. Barak, S. Aviv-Reuven, M. Baruch, B. Pinkas, and and Security Informatics (ISI), pp. 1–6, IEEE, 2021.
J. Keshet, “Deceiving end-to-end deep learning malware detectors [125] F. Song, Y. Lei, S. Chen, L. Fan, and Y. Liu, “Advanced evasion
using adversarial examples,” arXiv preprint arXiv:1802.04528, 2018. attacks and mitigations on practical ml-based phishing website clas-
[104] F. Kreuk, A. Barak, S. Aviv-Reuven, M. Baruch, B. Pinkas, and sifiers,” International Journal of Intelligent Systems, vol. 36, no. 9,
J. Keshet, “Adversarial examples on discrete sequences for beating pp. 5210–5240, 2021.
whole-binary malware detection,” arXiv preprint arXiv:1802.04528, [126] Z. Lin, Y. Shi, and Z. Xue, “Idsgan: Generative adversarial networks
pp. 490–510, 2018. for attack generation against intrusion detection,” arXiv preprint
[105] E. Raff, J. Barker, J. Sylvester, R. Brandon, B. Catanzaro, and C. K. arXiv:1809.02077, 2018.
Nicholas, “Malware detection by eating a whole exe,” in Workshops [127] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein gan,” arXiv
at the Thirty-Second AAAI Conference on Artificial Intelligence, preprint arXiv:1701.07875, 2017.
2018. [128] I. Homoliak, M. Teknos, M. Ochoa, D. Breitenbacher, S. Hosseini,
[106] Y. Qiao, W. Zhang, Z. Tian, L. T. Yang, Y. Liu, and M. Alazab, “Ad- and P. Hanacek, “Improving network intrusion detection classifiers by
versarial malware sample generation method based on the prototype non-payload-based exploit-independent obfuscations: An adversarial
of deep learning detector,” Computers & Security, p. 102762, 2022. approach,” arXiv preprint arXiv:1805.02684, 2018.
[107] E. Raff, J. Barker, J. Sylvester, R. Brandon, B. Catanzaro, and [129] H. Zhang, X. Yu, P. Ren, C. Luo, and G. Min, “Deep adversar-
C. Nicholas, “Malware detection by eating a whole exe,” arXiv ial learning in intrusion detection: A data augmentation enhanced
preprint arXiv:1710.09435, 2017. framework,” arXiv preprint arXiv:1901.07949, 2019.
[108] O. Suciu, S. E. Coull, and J. Johns, “Exploring adversarial examples [130] J. Aiken and S. Scott-Hayward, “Investigating adversarial attacks
in malware detection,” arXiv preprint arXiv:1810.08280, 2018. against network intrusion detection systems in sdns,” in 2019 IEEE
[109] J. W. Stokes, D. Wang, M. Marinescu, M. Marino, and B. Bussone, Conference on Network Function Virtualization and Software Defined
“Attack and defense of dynamic analysis-based, adversarial neural Networks (NFV-SDN), pp. 1–7, IEEE, 2019.
malware detection models,” in MILCOM 2018-2018 IEEE Military [131] Y. E. Sagduyu, Y. Shi, and T. Erpek, “Iot network security from
Communications Conference (MILCOM), pp. 1–8, IEEE, 2018. the perspective of adversarial deep learning,” in 2019 16th Annual
IEEE International Conference on Sensing, Communication, and
[110] M. Crawford, T. M. Khoshgoftaar, J. D. Prusa, A. N. Richter, and
Networking (SECON), pp. 1–9, IEEE, 2019.
H. Al Najada, “Survey of review spam detection using machine
learning techniques,” Journal of Big Data, vol. 2, no. 1, pp. 1–24, [132] F. Liang, C. Shen, and F. Wu, “An iterative bp-cnn architecture
2015. for channel decoding,” IEEE Journal of Selected Topics in Signal
Processing, vol. 12, no. 1, pp. 144–159, 2018.
[111] N. Dalvi, P. Domingos, S. Sanghai, and D. Verma, “Adversarial clas-
[133] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos,
sification,” in Proceedings of the tenth ACM SIGKDD international
“Learning to optimize: Training deep neural networks for wireless
conference on Knowledge discovery and data mining, pp. 99–108,
resource management,” in 2017 IEEE 18th International Workshop on
2004.
Signal Processing Advances in Wireless Communications (SPAWC),
[112] D. Lowd and C. Meek, “Good word attacks on statistical spam
pp. 1–6, IEEE, 2017.
filters.,” in CEAS, vol. 2005, 2005.
[134] T. J. O’Shea, J. Corgan, and T. C. Clancy, “Convolutional radio
[113] M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz, “A bayesian modulation recognition networks,” in International conference on
approach to filtering junk e-mail,” in Learning for Text Categoriza- engineering applications of neural networks, pp. 213–226, Springer,
tion: Papers from the 1998 workshop, vol. 62, pp. 98–105, Madison, 2016.
Wisconsin, 1998.
[135] T. J. O’Shea, T. Roy, and T. C. Clancy, “Over-the-air deep learning
[114] B. Nelson, M. Barreno, F. J. Chi, A. D. Joseph, B. I. Rubinstein, based radio signal classification,” IEEE Journal of Selected Topics in
U. Saini, C. Sutton, J. D. Tygar, and K. Xia, “Exploiting machine Signal Processing, vol. 12, no. 1, pp. 168–179, 2018.
learning to subvert your spam filter.,” LEET, vol. 8, no. 1, p. 9, 2008. [136] J. Canedo and A. Skjellum, “Using machine learning to secure iot
[115] G. Wang, T. Wang, H. Zheng, and B. Y. Zhao, “Man vs. machine: systems,” in 2016 14th annual conference on privacy, security and
Practical adversarial detection of malicious crowdsourcing workers,” trust (PST), pp. 219–222, IEEE, 2016.
in 23rd {USENIX} Security Symposium ({USENIX} Security 14), [137] Y. Meidan, M. Bohadana, A. Shabtai, J. D. Guarnizo, M. Ochoa,
pp. 239–254, 2014. N. O. Tippenhauer, and Y. Elovici, “Profiliot: a machine learning ap-
[116] P. Negi, A. Sharma, and C. Robustness, “Adversarial machine learn- proach for iot device identification based on network traffic analysis,”
ing against keystroke dynamics,” 2017. in Proceedings of the symposium on applied computing, pp. 506–509,
[117] M. F. Zeager, A. Sridhar, N. Fogal, S. Adams, D. E. Brown, and P. A. 2017.
Beling, “Adversarial learning in credit card fraud detection,” in 2017 [138] M. Miettinen, S. Marchal, I. Hafeez, N. Asokan, A.-R. Sadeghi,
Systems and Information Engineering Design Symposium (SIEDS), and S. Tarkoma, “Iot sentinel: Automated device-type identification
pp. 112–116, IEEE, 2017. for security enforcement in iot,” in 2017 IEEE 37th International
[118] C. Wang, D. Zhang, S. Huang, X. Li, and L. Ding, “Crafting Conference on Distributed Computing Systems (ICDCS), pp. 2177–
adversarial email content against machine learning based spam email 2184, IEEE, 2017.
detection,” in Proceedings of the 2021 International Symposium on [139] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner, “Detecting
Advanced Security on Software and Systems, pp. 23–28, 2021. adversarial samples from artifacts,” arXiv preprint arXiv:1703.00410,
[119] G. Zhaoquan, X. Yushun, H. Weixiong, Y. Lihua, H. Yi, and 2017.
T. Zhihong, “Marginal attacks of generating adversarial examples [140] M. Sadeghi and E. G. Larsson, “Adversarial attacks on deep-learning
for spam filtering,” Chinese Journal of Electronics, vol. 30, no. 4, based radio signal classification,” IEEE Wireless Communications
pp. 595–602, 2021. Letters, vol. 8, no. 1, pp. 213–216, 2018.
[120] A. Phung and M. Stamp, “Universal adversarial perturbations and [141] G. Apruzzese, M. Colajanni, L. Ferretti, and M. Marchetti, “Address-
image spam classifiers,” in Malware Analysis Using Artificial Intel- ing adversarial attacks against security systems based on machine
ligence and Deep Learning, pp. 633–651, Springer, 2021. learning,” in 2019 11th International Conference on Cyber Conflict
[121] V. Shahrivari, M. M. Darabi, and M. Izadi, “Phishing detection us- (CyCon), vol. 900, pp. 1–18, IEEE, 2019.
ing machine learning techniques,” arXiv preprint arXiv:2009.11116, [142] Y. Shi, Y. E. Sagduyu, T. Erpek, K. Davaslioglu, Z. Lu, and J. H.
2020. Li, “Adversarial deep learning for cognitive radio security: Jamming
[122] G. Gressel, N. Hegde, A. Sreekumar, and M. Darling, “Feature attack and defense strategies,” in 2018 IEEE International Conference
on Communications Workshops (ICC Workshops), pp. 1–6, IEEE, [165] Y. Song, T. Kim, S. Nowozin, S. Ermon, and N. Kushman, “Pixelde-
2018. fend: Leveraging generative models to understand and defend against
[143] S. Kokalj-Filipovic, R. Miller, and J. Morman, “Targeted adversarial adversarial examples,” arXiv preprint arXiv:1710.10766, 2017.
examples against rf deep classifiers,” in Proceedings of the ACM [166] R. Abou Khamis et al., “Investigating resistance of deep learning-
Workshop on Wireless Security and Machine Learning, pp. 6–11, based ids against adversaries using min-max optimization,” IEEE ICC
2019. 20, 2019.
[144] Y. Peng, J. Su, X. Shi, and B. Zhao, “Evaluating deep learning based [167] R. Abou Khamis and A. Matrawy, “Evaluation of adversarial training
network intrusion detection system in adversarial environment,” in on different types of neural networks in deep learning-based idss,”
2019 IEEE 9th International Conference on Electronics Information in 2020 international symposium on networks, computers and com-
and Emergency Communication (ICEIEC), pp. 61–66, IEEE, 2019. munications (ISNCC), pp. 1–6, IEEE, 2020.
[145] O. Ibitoye, O. Shafiq, and A. Matrawy, “Analyzing adversarial attacks [168] D. Meng and H. Chen, “Magnet: a two-pronged defense against
against deep learning for intrusion detection in iot networks,” in 2019 adversarial examples,” in Proceedings of the 2017 ACM SIGSAC
IEEE Global Communications Conference (GLOBECOM), pp. 1–6, Conference on Computer and Communications Security, pp. 135–
IEEE, 2019. 147, ACM, 2017.
[146] M. Kloft and P. Laskov, “Online anomaly detection under adversarial [169] J. H. Metzen, T. Genewein, V. Fischer, and B. Bischoff, “On detecting
impact,” in Proceedings of the thirteenth international conference on adversarial perturbations,” arXiv preprint arXiv:1702.04267, 2017.
artificial intelligence and statistics, pp. 405–412, 2010. [170] M. Ren, Y.-L. Wang, and Z.-F. He, “Towards interpretable defense
[147] B. Biggio, G. Fumera, and F. Roli, “Security evaluation of pattern against adversarial attacks via causal inference,” Machine Intelligence
classifiers under attack,” IEEE transactions on knowledge and data Research, vol. 19, no. 3, pp. 209–226, 2022.
engineering, vol. 26, no. 4, pp. 984–996, 2013. [171] F. Tramèr, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel,
“The space of transferable adversarial examples,” arXiv preprint
[148] D. Wu, B. Fang, J. Wang, Q. Liu, and X. Cui, “Evading machine
arXiv:1704.03453, 2017.
learning botnet detection models via deep reinforcement learning,” in
[172] L. Schmidt, S. Santurkar, D. Tsipras, K. Talwar, and A. Madry,
ICC 2019-2019 IEEE International Conference on Communications
“Adversarially robust generalization requires more data,” in Advances
(ICC), pp. 1–6, IEEE, 2019.
in Neural Information Processing Systems, pp. 5014–5026, 2018.
[149] Q. Cheng, S. Zhou, Y. Shen, D. Kong, and C. Wu, “Packet-
[173] H. Li, S. Zhou, W. Yuan, J. Li, and H. Leung, “Adversarial-example
level adversarial network traffic crafting using sequence generative
attacks toward android malware detection system,” IEEE Systems
adversarial networks,” arXiv preprint arXiv:2103.04794, 2021.
Journal, vol. 14, no. 1, pp. 653–656, 2019.
[150] J. Chen, D. Wu, Y. Zhao, N. Sharma, M. Blumenstein, and S. Yu, [174] Y. Xie, C. Shi, Z. Li, J. Liu, Y. Chen, and B. Yuan, “Real-time,
“Fooling intrusion detection systems using adversarially autoen- universal, and robust adversarial attacks against speaker recognition
coder,” Digital Communications and Networks, vol. 7, no. 3, pp. 453– systems,” in ICASSP 2020-2020 IEEE International Conference on
460, 2021. Acoustics, Speech and Signal Processing (ICASSP), pp. 1738–1742,
[151] Y. Sharon, D. Berend, Y. Liu, A. Shabtai, and Y. Elovici, “Tantra: IEEE, 2020.
timing-based adversarial network traffic reshaping attack,” arXiv [175] Y. Gong, B. Li, C. Poellabauer, and Y. Shi, “Real-time adversarial
preprint arXiv:2103.06297, 2021. attacks,” arXiv preprint arXiv:1905.13399, 2019.
[152] J. Uesato, B. O’Donoghue, A. v. d. Oord, and P. Kohli, “Adversarial [176] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov,
risk and the dangers of evaluating against weak attacks,” arXiv “How to backdoor federated learning,” in International Conference
preprint arXiv:1802.05666, 2018. on Artificial Intelligence and Statistics, pp. 2938–2948, 2020.
[153] F. Liao, M. Liang, Y. Dong, T. Pang, X. Hu, and J. Zhu, “Defense [177] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning:
against adversarial attacks using high-level representation guided Concept and applications,” ACM Transactions on Intelligent Systems
denoiser,” in Proceedings of the IEEE Conference on Computer and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.
Vision and Pattern Recognition, pp. 1778–1787, 2018. [178] G. Aceto, D. Ciuonzo, A. Montieri, and A. Pescapé, “Mobile
[154] A. S. Suggala, A. Prasad, V. Nagarajan, and P. Ravikumar, “Re- encrypted traffic classification using deep learning,” in 2018 Network
visiting adversarial risk,” in The 22nd International Conference on Traffic Measurement and Analysis Conference (TMA), pp. 1–8, IEEE,
Artificial Intelligence and Statistics, pp. 2331–2339, 2019. 2018.
[155] A. Fawzi, H. Fawzi, and O. Fawzi, “Adversarial vulnerability for any [179] E. Hesamifard, H. Takabi, and M. Ghasemi, “Cryptodl: Deep neural
classifier,” in Advances in Neural Information Processing Systems, networks over encrypted data,” arXiv preprint arXiv:1711.05189,
pp. 1178–1187, 2018. 2017.
[156] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, and A. Madry, [180] M. Lotfollahi, M. J. Siavoshani, R. S. H. Zade, and M. Saberian,
“There is no free lunch in adversarial robustness (but there are “Deep packet: A novel approach for encrypted traffic classification
unexpected benefits),” arXiv preprint arXiv:1805.12152, vol. 2, no. 3, using deep learning,” Soft Computing, vol. 24, no. 3, pp. 1999–2012,
2018. 2020.
[157] W. Brendel, J. Rauber, and M. Bethge, “Decision-based adversarial [181] E. Klein, R. Mislovaty, I. Kanter, A. Ruttor, and W. Kinzel, “Synchro-
attacks: Reliable attacks against black-box machine learning models,” nization of neural networks by mutual learning and its application
arXiv preprint arXiv:1712.04248, 2017. to cryptography,” in Advances in Neural Information Processing
[158] C. Guo, J. S. Frank, and K. Q. Weinberger, “Low frequency adver- Systems, pp. 689–696, 2005.
sarial perturbation,” arXiv preprint arXiv:1809.08758, 2018.
[159] A. Nayebi and S. Ganguli, “Biologically inspired protection of deep
networks from adversarial attacks,” arXiv preprint arXiv:1703.09202,
2017.
[160] Y. Yanagita and M. Yamamura, “Gradient masking is a type of over-
fitting,” International Journal of Machine Learning and Computing,
vol. 8, no. 3, 2018.
[161] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a
neural network,” arXiv preprint arXiv:1503.02531, 2015.
[162] A. Ross and F. Doshi-Velez, “Improving the adversarial robustness
and interpretability of deep neural networks by regularizing their
input gradients,” in Proceedings of the AAAI Conference on Artificial
Intelligence, vol. 32, 2018.
[163] K. Grosse, N. Papernot, P. Manoharan, M. Backes, and P. McDaniel,
“Adversarial perturbations against deep neural networks for malware
classification,” arXiv preprint arXiv:1606.04435, 2016.
[164] C. Xie, J. Wang, Z. Zhang, Z. Ren, and A. Yuille, “Miti-
gating adversarial effects through randomization,” arXiv preprint
arXiv:1711.01991, 2017.