to train and test the effectiveness and efficiency of the current state-of-the-art deep learning
models on different malware datasets. We examine eight popular DL approaches on various
datasets. This survey will help researchers develop a general understanding of malware
recognition using deep learning.
∗ Corresponding author
[email protected] (A. Bensaoud); [email protected] (J. Kalita); [email protected] (M. Bensaoud)
ORCID(s):
¹ https://www.nbcnews.com/tech/security/colorado-state-websites-struggle-russian-hackers-vow-attack-rcna51012
² https://www.nbcnews.com/tech/security/china-hacked-least-six-us-state-governments-report-says-rcna19255
³ https://attackmap.sonicwall.com/live-attack-map

• It presents the deep learning models for cryptographic ransomware (Section 11).

• It shows how we know if we can trust the results of a DL model using Explainable Artificial Intelligence, XAI (Section 12).

• It discusses the significant challenge to reliability and security posed by adversarial attacks on deep learning models (Section 13).
In the rest of this paper, we discuss avenues for future research and examine the EfficientNet B0, B1, B2, B3, B4, B5, B6, and B7 models on malware image datasets for classification.

2. Mechanics of Malware Attacks

The hacker has one goal, which is to get malware installed onto a victim's computer. Because most computers are protected by some type of firewall, direct attacks are difficult or impossible to perform. Therefore, attackers attempt to trick the computer into running the malicious code. The most common way to do this is by using documents or executable files. For instance, a hacker may send a phishing email to the victim with a malicious document attachment or a link to a website where the malicious document is located. Once the victim opens the document, embedded exploits or scripts run and download or extract more malware. This downloaded payload is the real malware the hacker wants to run on the victim's system and is often something like a backdoor or ransomware. Malicious documents are therefore usually not the final piece of malware in an attack, but one of the vectors used by the hacker to get onto the system. As an example, below we discuss how a PDF document can be used to initiate an attack.

2.1. PDF and Document Files
When analyzing a PDF, we find three things: Objects, which form the structure of the PDF; Keywords, which control how the PDF works; and Data stored or encoded within the PDF.

• Objects are the building blocks of a PDF. Every PDF starts with a Header which needs to be present in the first 1024 bytes of the document. Some hackers take advantage of this by putting unrelated data within the first 1024 bytes. This is a very simple technique to try to avoid signature-based detection. PDFs are composed of objects; each object holds specific data within the document or performs a specific function. Each object starts with two numbers (the object number and its version, or generation, number), followed by the keyword obj, and ends with endobj. There are many kinds of objects, such as font objects, image objects, and even objects that contain metadata.

• There are many keywords that begin with a / and describe how the PDF works. Some of the keywords related to malicious activity include /OpenAction and /AA (additional actions), both of which indicate an automatic action to be performed when the document is viewed⁴. Such a keyword points to another object that automatically gets opened or executed when the PDF is opened. Malicious PDFs have /OpenAction pointing to some malicious JavaScript, or to an object containing an exploit; whenever one opens the document, the system is automatically compromised. The /JavaScript or /JS keyword indicates the presence of JavaScript code. Malicious PDFs usually contain malicious JavaScript to launch an exploit or download additional malware. Some objects can be referred to by a /Name instead of their number. Some PDFs have files embedded or linked with the keywords /EmbeddedFile, /URI, or /SubmitForm. A /URI is accessed or downloaded when the object is loaded.

⁴ https://blog.didierstevens.com/programs/pdf-tools/
• PDFs can encode data in multiple ways, which makes the format very flexible but also lets hackers encode and hide their data. For example, names are case sensitive, but can be fully or partially hex encoded. More precisely, the # sign followed by two hex characters represents hex encoded data. Data can also be octal encoded, that is, represented by its base-eight number. An octal encoded character is written as a \ followed by three digits between 0 and 7. Moreover, hackers can mix hex, octal, and ASCII data all together, which makes it possible to hide data such as JavaScript code or URLs.

Names and strings can be encoded, but data streams can be modified and encoded further using filters. Filters are algorithms that are applied to the data to encode or compress it within the PDF. There are multiple filters that can be used in PDFs, such as /ASCIIHexDecode, hex encoding of characters; /LZWDecode, LZW compression algorithm; /FlateDecode, Zlib compression; /ASCII85Decode, ASCII base-85 representation; and /Crypt, various encryption algorithms. For example, in Fig. 2, we have a PDF document with three objects. Object 1 is a catalog that has an OpenAction referring to version 0 of Object 2, which means that as soon as the document is opened, Object 2 will be run. Object 2 contains a JavaScript keyword, but we do not see any JavaScript code in this object because the JavaScript keyword refers to another object, Object 3. Object 3 is a stream object, as indicated by the stream keyword, and has been ASCIIHex encoded and compressed with the Zlib compression algorithm. We have thus been able to determine that JavaScript will be executed as soon as the PDF opens, but we do not yet know what the JavaScript's goal is. If this is a malicious PDF, it can cause problems. In Fig. 3, the JavaScript code references two host names, performs an HTTP GET request to each, saves an executable file, and finally runs it.

JavaScript, it is difficult for hackers to get their exploit to work.

Fig. 3: Malicious JavaScript code

Most hackers try to hide what their script is doing using obfuscation techniques. Most techniques used to obfuscate scripts can be broken down into four different categories. How the format of a program is obfuscated is shown in Fig. 4; approaches include adding extra lines of code, obfuscating the data, and substituting variable names.

Fig. 4: Obfuscated malicious JavaScript code.
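The name and string encodings described in Section 2.1 can be undone with a few regular expressions. The following is a minimal sketch, not a full PDF parser: the function names and the `SUSPICIOUS_KEYWORDS` list are illustrative choices based on the keywords named in the text, and the regular expression for names is a simplification that assumes names contain only letters, digits, and `#`.

```python
import re

# Keywords the text associates with malicious PDFs (illustrative subset).
SUSPICIOUS_KEYWORDS = ["/OpenAction", "/AA", "/JavaScript", "/JS",
                       "/EmbeddedFile", "/SubmitForm"]


def decode_pdf_name(name: str) -> str:
    """Replace every #XX hex escape in a PDF name with the character it encodes."""
    return re.sub(r"#([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), name)


def decode_octal_escapes(text: str) -> str:
    """Replace every backslash + three octal digits with the character it encodes."""
    return re.sub(r"\\([0-7]{3})",
                  lambda m: chr(int(m.group(1), 8)), text)


def count_keywords(raw_pdf: bytes) -> dict:
    """Count suspicious keywords after normalising hex-encoded names."""
    text = raw_pdf.decode("latin-1")  # latin-1 maps every byte to one character
    names = [decode_pdf_name(n) for n in re.findall(r"/[A-Za-z0-9#]+", text)]
    return {kw: names.count(kw) for kw in SUSPICIOUS_KEYWORDS}


sample = (b"1 0 obj << /Type /Catalog /OpenAction 2 0 R >> endobj "
          b"2 0 obj << /S /J#61vaScript /JS 3 0 R >> endobj")
print(count_keywords(sample))
```

Because names are decoded before they are compared, the partially hex-encoded `/J#61vaScript` in the sample is counted as `/JavaScript`, which a plain byte search for that keyword would miss.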
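A stream such as Object 3 in the PDF example of Section 2.1, which is hex encoded and Zlib compressed, can be recovered by applying the inverse of each filter in order. The sketch below uses only the Python standard library and handles just two of the filters listed in that section; the function name `decode_stream` and the sample script are hypothetical.

```python
import binascii
import zlib


def decode_stream(data: bytes, filters: list) -> bytes:
    """Apply the named PDF stream filters to data, in the order given."""
    for name in filters:
        if name == "/ASCIIHexDecode":
            hex_text = b"".join(data.split())   # whitespace is ignored
            hex_text = hex_text.rstrip(b">")    # '>' marks the end of the data
            if len(hex_text) % 2:               # odd length: pad a final 0 digit
                hex_text += b"0"
            data = binascii.unhexlify(hex_text)
        elif name == "/FlateDecode":
            data = zlib.decompress(data)
        else:
            raise ValueError("filter not handled in this sketch: " + name)
    return data


# Build a stream the way an attacker might: compress, then hex encode.
script = b"app.alert('hello');"
encoded = binascii.hexlify(zlib.compress(script)) + b">"
print(decode_stream(encoded, ["/ASCIIHexDecode", "/FlateDecode"]))
```

Only after this decoding step does the JavaScript become visible for inspection, which is why analysis tools decode streams before searching them.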
Table 1
Syslog and Windows log
of Android apps and applied a deep learning technique. Chaulagain et al. [6] presented a deep learning-based hybrid analysis technique that collects different artifacts during static and dynamic analysis to train the deep learning models.

5. Data for Malware Detection
Numerous system logs of the activities of machines such as phones, tablets, laptops, and other devices are generated by the operating system and other infrastructure software. The data are created and stored on the local device and sent to remote servers. By analyzing log data, we can not only detect breaches or suspicious activity, but also track behavior through the network. Log data allow us to track security events, troubleshoot the infrastructure, and optimize the environment and the machines. Log data can take many different forms, such as syslog, authentication logs, local security event logs, network asset logs, and system logs. One of the goals in malware detection is to be able to read, search, and analyze the data efficiently and effectively.
Table 1 contains some useful information from syslog and Windows logs. Both kinds of logs have many components, in different formats, that help in an investigation.

6. Generating Malware Images for Deep Learning
Several tools can visualize and edit a binary file in hexadecimal or ASCII formats, such as IDA Pro⁷, x32/x64 Debugger⁸, HxD⁹, PE-bear¹⁰, Yara¹¹, Fiddler¹², Metadata¹³, XOR analysis¹⁴, and Embedded strings¹⁵.

⁷ https://hex-rays.com/ida-pro
⁸ https://x64dbg.com/#start
⁹ https://mh-nexus.de/en/hxd
¹⁰ https://hshrzd.wordpress.com/pe-bear
¹¹ https://yara.readthedocs.io/en/stable
¹² https://www.telerik.com/purchase/fiddler
¹³ https://www.malwarebytes.com/glossary/metadata
¹⁴ https://eternal-todo.com/var/scripts/xorbruteforcer
¹⁵ https://virustotal.github.io/yara/

A malware file or its code can be used to generate an image by converting the binary, octal, hexadecimal, or decimal values into a two-dimensional matrix of pixels. The image can be grayscale or RGB. In grayscale, pixels take values in the range [0-255], where 0 represents black and 255 represents white.

Gray image feature: The machine stores images in a matrix of numbers. These numbers, or pixel values, denote the intensity or brightness of the pixel. Smaller numbers (close to zero) represent black, and larger numbers (closer to 255) denote white (see Fig. 6).

Fig. 6: Malware feature representation in grayscale image

RGB images: There are three matrices or channels (Red, Green, Blue), where each matrix has values between 0 and 255. These three colors are combined in various ways to represent one of 16,777,216 possible colors (see Fig. 7).

Fig. 7: Malware feature representation in RGB image

Malware can be converted to images in different ways. Yuan et al. [7] converted malware binaries into
Markov images by computing the transfer probability of bytes, where each pixel is generated by equation 1, in which $f(m, n)$ is the number of times byte $m$ is followed by byte $n$:

\[
p_{m,n} = P(n \mid m) = \frac{f(m, n)}{\sum_{n=0}^{255} f(m, n)}, \qquad m, n \in \{0, 1, \ldots, 255\}. \tag{1}
\]

Mohammed et al. [8] used a vector of 16-bit signed hexadecimal numbers to represent a 256 × 256 image. Then, they computed bi-gram frequency counts, which they used as pixel intensity values. The full-frame Discrete Cosine Transform (DCT) [9] was computed to de-sparsify, and the bigram-DCT was used to represent the output image. Euh et al. [10] proposed the Window Entropy Map (WEM) to visualize malware as an image. They calculated the entropy for each byte to measure the degree of uncertainty. Ni et al. [11] converted malware code into gray images using SimHash [12] and then encoded them. They mapped SimHash values to pixels and then converted them to grayscale images.

7. Image Classification for Malware Detection
Deep learning can solve diverse "vision" problems, including malware image classification tasks. Deep learning can extract features automatically, obviating manual feature extraction. The content of the malware executable file is first converted into a digital image. Nataraj et al. [13] visualized the byte codes of samples from 25 malware families as grayscale images. Several visualization techniques have been used for malware classification. The basic idea used in these methods is to explore the distinguishing patterns in malware images. In addition, the visualization techniques help find the correlations among different malware families. Some existing approaches generate grayscale images and others generate RGB images. Most existing approaches use global features to generate malware images.
Yuan et al. [7] proposed a method based on Markov images according to the byte transfer probability matrix. They used a CNN to classify Markov malware images without scaling. Narayanan and Davuluru [14] proposed an ensemble approach using RNN and CNN architectures for malware image classification. Images were generated from assembly compiled files and classified using CNNs. Zhu et al. [15] proposed a Task-Aware Meta Learning-based Siamese Neural Network to classify obfuscated malware images. Their model showed high effectiveness on unique malware signature detection to classify obfuscated malware. Chauhan et al. [16] visualized malware files in different color modes: RGB, HSV, grayscale, and BGR. They used a support vector machine (SVM) to classify these malware images, with an accuracy of 96% in all modes. Darem et al. [17] designed a semi-supervised method based on malware images and feature engineering for obfuscated malware detection. The model achieved 99.12% accuracy on obfuscated malware detection. Asam et al. [18] proposed two malware image classification approaches called Deep Feature Space-based Malware classification (DFS-MC) and Deep Boosted Feature Space-based Malware classification (DBFS-MC). The approach achieved an accuracy of 98.61% on the MalImg malware dataset.
Xiao et al. [19] presented a visualization method called Colored Label boxes (CoLab) to specify each section in a PE file and convert it to a malware image. The authors built a composed CoLab image and used VGG16 and a support vector machine for classification. The model was applied to two datasets, VX-Heaven¹⁶ and BIG-2015, with 96.59% and 98.94% average accuracies, respectively. A comparison of the reviewed malware image classification approaches is given in Table 2.

¹⁶ https://archive.org/download/vxheavens-2010-05-18

8. Feature Reduction for Efficient Malware Detection
Feature reduction reduces the number of variables or features in the representation of a data example. Approaches to feature reduction can be divided into two subcategories: a) Feature Selection, which includes methods such as Wrappers, Filters, and Embedded methods, and b) Feature Extraction, which includes methods such as Principal Component Analysis [26]. How does feature reduction improve performance? It does so by reducing the number of features that are considered for analysis.
In feature extraction, we start with $n$ features $x_1, x_2, x_3, \ldots, x_n$, which we map to a lower dimensional space to get the new features $z_1, z_2, z_3, \ldots, z_m$, where $m < n$. Each of the new features is usually a linear combination of the original feature set $x_1, x_2, x_3, \ldots, x_n$. Thus, each new feature is obtained as a function $F(X)$ of the original feature set $X$. This projects a higher dimensional feature space onto a lower dimensional feature space, so that the smaller feature set may lead to better or faster classification (see equation 2).

\[
\begin{bmatrix} z_1 & \cdots & z_m \end{bmatrix}^{\top} = F\left( \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}^{\top} \right) \tag{2}
\]

In feature selection, we choose a subset of the features, in contrast to feature extraction, where we map the original features to a lower dimensional space. The smaller feature set can help produce better as well as faster classification. To obtain such a mapping, we need to find a projection matrix $W$ such that $\bar{Z} = W^{\top} \bar{X}$. We expect from such a projection that the new features are uncorrelated, non-redundant, and cannot be reduced further. Next, we need the features to have large variance. Why? Because if a feature takes similar values for all the instances, that feature cannot be used as a discriminator.
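As a worked illustration of the projection matrix discussed in Section 8, the following shows how Principal Component Analysis [26], one of the feature extraction methods the text names, produces new features that are uncorrelated and have large variance. This is the standard textbook construction rather than a derivation taken from the surveyed papers; the symbols $N$, $\mu$, $C$, $w_k$, and $\lambda_k$ are introduced here only for illustration.

```latex
% Sample mean and covariance of N feature vectors \bar{X}_1, ..., \bar{X}_N in R^n
\mu = \frac{1}{N} \sum_{i=1}^{N} \bar{X}_i, \qquad
C = \frac{1}{N} \sum_{i=1}^{N} (\bar{X}_i - \mu)(\bar{X}_i - \mu)^{\top}

% Eigenvectors of C, ordered by decreasing eigenvalue, with w_j^T w_k = 0 for j != k
C\, w_k = \lambda_k\, w_k, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0, \qquad \|w_k\| = 1

% Keep the m < n leading eigenvectors as the columns of the projection matrix
W = \begin{bmatrix} w_1 & \cdots & w_m \end{bmatrix}, \qquad
\bar{Z} = W^{\top} (\bar{X} - \mu)

% The new features are uncorrelated, and feature z_k has variance \lambda_k
\operatorname{Cov}(\bar{Z}) = W^{\top} C\, W = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)
```

Because the covariance of the projected features is diagonal, no new feature is a linear function of another, and keeping the largest eigenvalues keeps the directions along which the instances differ the most.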
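The two conversions described in Section 6, laying bytes out as grayscale pixels and building a Markov image from byte-transfer probabilities (equation 1), can be sketched in plain Python. This is a simplified illustration, not the implementation of Yuan et al. [7]: the function names, the default row width of 256, and the zero padding of the last row are assumptions made here.

```python
def bytes_to_grayscale(data: bytes, width: int = 256) -> list:
    """Lay bytes out row by row; each byte is one pixel value in 0..255.

    The last row is padded with zeros (black) so that all rows have equal length.
    """
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def markov_image(data: bytes) -> list:
    """Return a 256 x 256 matrix whose entry [m][n] estimates P(n | m)."""
    counts = [[0] * 256 for _ in range(256)]
    for m, n in zip(data, data[1:]):      # f(m, n): byte m is followed by byte n
        counts[m][n] += 1
    image = []
    for row in counts:
        total = sum(row)                  # sum over n of f(m, n)
        image.append([c / total if total else 0.0 for c in row])
    return image


pixels = bytes_to_grayscale(b"\x00\xff\x10", width=2)
probs = markov_image(b"\x01\x02\x01\x03")
print(pixels)
print(probs[1][2], probs[1][3], probs[2][1])
```

The Markov image always has a fixed size of 256 × 256 regardless of the file length, whereas the height of the grayscale image grows with the size of the file.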
Table 2
Comparative performance summary of Transfer Learning models for malware image classification.
Feature extraction methods such as Principal Component Analysis (PCA) [26], GIST [27], Hu Moments [28], Color Histogram [29], Haralick texture [30], Discrete Wavelet Transform (DWT) [31], Independent Component Analysis (ICA) [32], Linear Discriminant Analysis (LDA) [33], Oriented FAST and Rotated BRIEF (ORB) [34], Speeded Up Robust Feature (SURF) [35], Scale Invariant Feature Transform (SIFT) [36], Dense Scale Invariant Feature Transform (D-SIFT) [36], Local Binary Patterns (LBPs) [37], and KAZE [38] have been combined with machine learning, including deep learning. These methods successfully filter the characteristics of malware files.

Azad et al. [39] proposed a method named DEEPSEL (Deep Feature Selection) to identify malicious codes of 39 unique malware families. Their model achieved an accuracy of 83.6% and an F-measure of 82.5%. Tobiyama et al. [40] proposed feature extraction based on system calls. A Recurrent Neural Network was used to extract features and a Convolutional Neural Network to classify these features.

9. Deep Transfer Learning Models for Malware Detection
Transfer learning takes place if we have a source model which has some pre-trained knowledge and this knowledge is needed as the foundation to build a new model [41]. For example, using a very large pre-trained convolutional neural network usually involves saving a network that was previously trained on some large dataset, typically on a large-scale image classification task, using a dataset like ImageNet [42]. After training a network on the ImageNet dataset, we can re-purpose this trained network. Research papers have discussed applying these pre-trained networks to malware image datasets [43, 44, 45, 46] that are generated from PE and APK malware files, which are quite different from each other.
Malware image datasets are very different from ImageNet, which is normally used to pre-train the model. The ImageNet dataset and a malware image dataset contain visually completely different images. However, pre-training still seems to help. Training a machine learning algorithm on large datasets can be done in two ways, as discussed below.

9.1. Using feature extraction
Feature extraction, discussed earlier, is a practical, common, and low resource-intensive way of using pre-trained networks. It takes the convolutional base of a previously trained network, runs the malware data through it, and then trains a new classifier on top of the output. As shown in Fig. 8, we can choose a network such as VGG16 [47] that has been trained on ImageNet, as an example. The input, fed at the bottom, goes up through the trained convolutional base, which represents the CNN region of VGG16. The trained classifier resides in the dense region, and the prediction is made by this dense region at the end. Usually, we have 1000 neurons at the end to predict the actual ImageNet classes. We take this ImageNet trained model as the base and remove the classifier layer, keeping the convolutional layers of the pre-trained model along with their weights. In the next step, we attach on top a new classifier that has new dense layers for malware classification. The weights of the base are frozen, which means that during training the malware input passes through convolutional layers that keep their prior weights. However, all dense layers are randomly initialized, and the interconnection weights for these layers are learned during the new training process for detecting malware.
Why remove the original dense layers? It has been observed that the representations learned by the convolutional base are generic and therefore reusable for a variety of tasks, whereas the original dense layers are specific to the ImageNet classes.

9.2. Using fine tuning
Fine-tuning involves changing some of the convolutional layers by learning new weights. In Fig. 9, we have a network divided into three regions. The yellow region is a pre-trained model. The green region represents our dense layers, for which we need to learn the weights. During training using a library such as Keras [48] and Tensorflow [49], we can select certain layers and freeze the weights of those layers.
For example, we can select convolutional block one and then freeze all the weights of the convolutional layers in this block only. This means that during training, everything else will change, but the weights of
Fig. 9: Fine Tuning of Transfer Learning
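The two transfer-learning modes of Sections 9.1 and 9.2 can be sketched with Keras [48], the library named in the text. This is a minimal sketch under stated assumptions, not the configuration used in any surveyed paper: the function name `build_transfer_model`, the 224 × 224 × 3 input size, the 256-unit dense layer, and the 25-class output are hypothetical choices. The default `weights=None` builds an untrained VGG16 so that the sketch runs without downloading anything; passing `weights="imagenet"` would load the pre-trained base that the text describes.

```python
from tensorflow import keras


def build_transfer_model(num_classes, fine_tune=False, weights=None):
    """VGG16 convolutional base without its ImageNet classifier, plus a new head."""
    base = keras.applications.VGG16(include_top=False, weights=weights,
                                    input_shape=(224, 224, 3))
    if fine_tune:
        # Section 9.2: freeze convolutional block one only, train everything else.
        for layer in base.layers:
            layer.trainable = not layer.name.startswith("block1_")
    else:
        # Section 9.1: freeze the whole base, train only the new dense layers.
        base.trainable = False

    inputs = keras.Input(shape=(224, 224, 3))
    x = base(inputs)
    x = keras.layers.GlobalAveragePooling2D()(x)
    x = keras.layers.Dense(256, activation="relu")(x)       # randomly initialized
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


extractor = build_transfer_model(num_classes=25)
tuned = build_transfer_model(num_classes=25, fine_tune=True)
print(len(extractor.trainable_weights), len(tuned.trainable_weights))
```

In the feature extraction mode only the kernels and biases of the two new dense layers are trainable; in the fine-tuning mode the convolutional blocks after block one are updated as well, which usually calls for a smaller learning rate so that the pre-trained weights are not destroyed early in training.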
need to put an intermediate layer between the network and the ciphertext that computes the features, such as unigram frequencies, bigram frequencies, Index of Coincidence (IoC), HasDoubleLetters, etc., and then we can train the network with millions of ciphertexts and all American Cryptogram Association (ACA) cipher types. For example, in Fig. 18, the three blue neural networks are given the frequencies of N-grams (1-grams, 2-grams, 3-grams, 4-grams, etc.), and the green neural network computes HasDoubleLetters. Then we have a hidden layer that connects the input and output layers. Finally, in this case the designed neural network reports 90% for Seriated Playfair and 10% for Bazeries. Baksi [95] designed a machine-learning model for differential attacks on the non-Markov 8-round GIMLI cipher and GIMLI hash. The work applied a multi-layer perceptron (MLP), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM).

Fig. 18: Detect the Cipher Type With Neural Networks

The ransomware families that encrypt data and force the victim to make a payment via cryptocurrency include WannaCry, Locky, Stop, CryptoJoker, CryptoWall, TeslaCrypt, Dharma, Locker, Cerber, and GandCrab. Recently, deep learning algorithms have been used for cryptography [96]. Ding et al. [97] proposed DeepEDN to fulfill the process of encrypting and decrypting medical images. Kim et al. [98] proposed the detection of cryptographic ransomware using a Convolutional Neural Network. Their model prevents crypto-ransomware infection by detecting a block cipher algorithm. Sharmeen et al. [99] proposed an approach to extract the intrinsic attack characteristics of unlabeled ransomware samples using a deep learning-based unsupervised learned model. Fischer et al. [100] designed a tool to detect security vulnerabilities of cryptographic APIs in Android, achieving an average AUC-ROC of 99.2%.

12. Explainable Artificial Intelligence (XAI)
Explainable Artificial Intelligence (XAI) is a rapidly emerging field that focuses on creating transparent and interpretable models (see Fig. 19). In the context of malware detection, XAI can help security experts and analysts understand how a machine learning model arrived at its decisions, making it easier to identify and understand false positives and false negatives. By applying XAI techniques, such as Local Interpretable Model-Agnostic Explanations (LIME) [101] or Deep Learning Important Features (DeepLIFT) [102], security teams can gain insights into the most important features and decision-making processes of the model. This can help them identify areas where the model may be vulnerable to evasion or identify new malware strains that the model may have missed. Ultimately, XAI can improve the trustworthiness and reliability of machine learning models for malware detection, enabling more effective threat detection and response.
Nadeem et al. [103] provided a comprehensive survey and analysis of the current state of research on explainable machine learning (XAI) techniques for computer security applications. The paper highlights the
challenges and opportunities for adopting XAI in the security domain and discusses several approaches for designing and evaluating explainable machine learning models. Vivek et al. [104] proposed an approach for detecting ATM fraud using explainable artificial intelligence (XAI) and causal inference techniques. They presented a detailed analysis of the proposed method and highlighted its effectiveness in improving the accuracy and interpretability of ATM fraud detection systems. Kinkead et al. [105] proposed an approach that uses LIME to identify important locations in the opcode sequence that are deemed significant by the Convolutional Neural Network (CNN). McLaughlin et al. [106] used the Layer-wise Relevance Propagation (LRP) [107] and DeepLIFT [102] methods to identify the opcode sequences for most malware families, and they demonstrated that the CNN, while using the DAMD dataset, learned patterns from the underlying opcode representation. Hooker et al. [108] proposed a method to remove relevant features detected by an XAI approach and verify the accuracy degradation. Lin et al. [109] presented seven different XAI methods and automated the evaluation of the correctness of explanation techniques. The first four XAI methods are white-box approaches to determine the importance of input features: Backpropagation (BP), Guided Backpropagation (GBP), Gradient-weighted Class Activation Mapping (GCAM), and Guided GCAM (GGCAM). The last three are black-box approaches that observe an essential feature in the output probability using perturbed samples of the input: Occlusion Sensitivity (OCC), Feature Ablation (FA), and Local Interpretable Model-Agnostic Explanations (LIME).
Guo et al. [110] proposed LEMNA, an approach for explaining deep learning based security applications, which generates interpretable features to explain how input samples are classified. Kuppa and Le-Khac [111] presented a comprehensive analysis of the vulnerability of XAI methods to adversarial attacks in the context of cybersecurity, discussing potential risks associated with deploying XAI models in real-world applications, and proposing a framework for designing robust and secure XAI systems. Rao and Mane [112] proposed an approach to protect and analyze systems against the alarm-flooding problem using the NSL-KDD dataset. They included a Security Information and Event Management (SIEM) system to generate a zero-shot method for detecting alarm labels specific to adversarial attacks. Although explainable artificial intelligence (XAI) has gained significant attention, its effectiveness in malware detection still requires further investigation.

13. Adversarial Attack on Deep Neural Networks
Adversarial examples refer to maliciously crafted inputs to machine learning models designed to deceive the model into making incorrect predictions. Deep detection in this context refers to the use of deep learning
models for detecting and classifying objects or patterns in the input data. Adversarial examples can be specifically crafted to evade deep detection models and cause them to misclassify or miss the target objects or patterns. Therefore, adversarial examples can be seen as a type of attack on deep detection models. Adversarial examples can be generated using a variety of techniques, including optimization-based approaches and perturbation-based approaches, and can be used for various objectives, including evasion attacks and poisoning attacks. Zhong et al. [113] proposed a novel adversarial malware example generation method called Malfox, which uses conditional generative adversarial networks (conv-GANs) to generate camouflaged adversarial examples against black-box detectors. The presented method was evaluated on two real-world malware detection systems, and the results showed that Malfox achieved high attack success rates while maintaining low detection rates. Zhao et al. [114] proposed a new method called SAGE for steering the adversarial generation of examples with accelerations. The technique combines the advantages of gradient-based and gradient-free methods to generate more effective and efficient adversarial examples.
The development of defense mechanisms against adversarial attacks is a computationally expensive process, which can potentially affect the performance of the deep learning model. In addition, adversarial examples can impact the generalization ability of deep learning models, resulting in poor performance on new and unseen data. Moreover, generating adversarial examples can be computationally intensive, especially for large datasets and complex models, which can hinder the practical deployment of deep learning models in real-world applications. Thus, further research is required to improve the efficiency and effectiveness of defense mechanisms, as well as the generalization ability and robustness of deep learning models to adversarial attacks.
Hu and Tan [115] proposed a method to generate adversarial malware examples using Generative Adversarial Networks (GANs) for black-box attacks. Their results show that the generated adversarial malware samples can evade detection by existing machine learning models while maintaining high similarity to the original malware. Ling et al. [116] conducted a survey of the state of the art in adversarial attacks against Windows PE malware detection, covering various types of attacks and defense mechanisms. The authors also provided insights on potential future research directions in this area. Xu et al. [117] proposed a semi-black-box adversarial sample attack framework called Ofei that can generate adversarial samples against Android apps deployed on a deep-learning-as-a-service (DLaaS) platform. The framework utilizes a multi-objective optimization algorithm to generate robust and stealthy adversarial samples. Qiao et al. [118] proposed an adversarial detection method for ELF malware using model interpretation and showed that their method can effectively identify adversarial ELF malware with high accuracy. The proposed approach combines random forests and LIME to identify the most important features and thus improve the interpretability and robustness of the model. Meenakshi and Maragatham [119] proposed a defensive technique using the Curvelet transform to recognize adversarial iris images, optimizing the image classification accuracy. The designed method was shown to be effective against several existing adversarial attacks on iris recognition systems. Pintor et al. [120] introduced a method for debugging and improving the optimization of adversarial examples by identifying and analyzing the indicators of attack failure. The proposed method can help to improve the robustness of deep learning models against adversarial attacks.

14. Conclusion
Machine learning has started to gain the attention of malware detection researchers, notably in malware image classification and cipher cryptanalysis. However, more experimentation is required to understand the capabilities and limitations of deep learning when used to detect and classify malware. Deep learning can reduce the need for static and dynamic analysis and discover suspicious patterns. In the future, researchers may consider developing more accurate, robust, scalable, and efficient deep learning models for malware detection systems for various operating systems. Multi-task learning and transfer learning can also provide valuable results in classifying all types of malware. Furthermore, we show that the significant challenges of deep learning approaches that need to be considered are hyperparameter optimization, fine-tuning, and the size and quality of datasets, particularly when features are overweighted or overrepresented. We also illustrate the opportunities and challenges of XAI in deep learning as well as future research directions in the context of malware detection. Finally, we presented the idea of adversarial attacks on deep neural networks, which introduce small, carefully crafted perturbations to input data in order to cause misclassification or reduce model performance.

References
[1] A. Damodaran, F. Di Troia, C. A. Visaggio, T. H. Austin, M. Stamp, A comparison of static, dynamic, and hybrid analysis for malware detection, Journal of Computer Virology and Hacking Techniques 13 (2017) 1–12.
[2] N. Naik, P. Jenkins, N. Savage, L. Yang, T. Boongoen, N. Iam-On, Fuzzy-import hashing: A static analysis technique for malware detection, Forensic Science International: Digital Investigation 37 (2021) 301139.
[3] Mohamad, J. Arif, M. F. Ab Razak, S. Awang, S. R. Tuan Mat, N. S. N. Ismail, A. Firdaus, A static analysis approach for android permission-based malware detection systems, PloS one 16 (2021) e0257968.
[4] T. Kim, S. C. Suh, H. Kim, J. Kim, J. Kim, An encoding technique for cnn-based network anomaly detection, in: 2018 IEEE International Conference on Big Data (Big Data), 2018, pp. 2960–2965. doi:10.1109/BigData.2018.8622568.
[5] Y. Bai, Z. Xing, D. Ma, X. Li, Z. Feng, Comparative analysis of feature representations and machine learning methods in
android family classification, Computer Networks 184 (2021) 107639.
[6] D. Chaulagain, P. Poudel, P. Pathak, S. Roy, D. Caragea, G. Liu, X. Ou, Hybrid analysis of android apps for security vetting using deep learning, in: 2020 IEEE Conference on Communications and Network Security (CNS), 2020, pp. 1–9. doi:10.1109/CNS48642.2020.9162341.
[7] B. Yuan, J. Wang, D. Liu, W. Guo, P. Wu, X. Bao, Byte-level malware classification based on markov images and deep learning, Computers & Security 92 (2020) 101740.
[8] T. M. Mohammed, L. Nataraj, S. Chikkagoudar, S. Chandrasekaran, B. Manjunath, Malware detection using frequency domain-based image visualization and deep learning, arXiv preprint arXiv:2101.10578 (2021).
[9] S. A. Khayam, The discrete cosine transform (dct): theory and application, Michigan State University 114 (2003) 1–31.
[10] S. Euh, H. Lee, D. Kim, D. Hwang, Comparative analysis of low-dimensional features and tree-based ensembles for malware detection systems, IEEE Access 8 (2020) 76796–76808.
[11] S. Ni, Q. Qian, R. Zhang, Malware identification using
[25] W. W. Lo, X. Yang, Y. Wang, An xception convolutional neural network for malware classification with transfer learning, in: 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS), 2019, pp. 1–5. doi:10.1109/NTMS.2019.8763852.
[26] N. Barath, D. Ouboti, M. Temesguen, Pattern recognition algorithms for malware classification, in: proceeding of 2016 IEEE Conference of Aerospace and Electronics, 2016, pp. 338–342.
[27] A. Oliva, A. Torralba, Modeling the shape of the scene: A holistic representation of the spatial envelope, International journal of computer vision 42 (2001) 145–175.
[28] M.-K. Hu, Visual pattern recognition by moment invariants, IRE transactions on information theory 8 (1962) 179–187.
[29] M. J. Swain, D. H. Ballard, Color indexing, International journal of computer vision 7 (1991) 11–32.
[30] W.-C. Lin, J. Hays, C. Wu, V. Kwatra, Y. Liu, A comparison study of four texture synthesis algorithms on regular and near-regular textures, Tech. Rep. (2004).
[31] K. Kancherla, J. Donahue, S. Mukkamala, Packer identification using byte plot and markov plot, Journal of Computer
visualization images and deep learning, Computers & Security Virology and Hacking Techniques 12 (2016) 101–111.
77 (2018) 871–885. [32] J. Herault, C. Jutten, Space or time adaptive signal processing
[12] M. S. Charikar, Similarity estimation techniques from round- by neural network models, in: AIP conference proceedings,
ing algorithms, in: Proceedings of the thiry-fourth annual American Institute of Physics, 1986, pp. 206–211.
ACM symposium on Theory of computing, 2002, pp. 380– [33] Z. Fan, Y. Xu, D. Zhang, Local linear discriminant analysis
388. framework using sample neighbors, IEEE Transactions on
[13] L. Nataraj, S. Karthikeyan, G. Jacob, B. S. Manjunath, Mal- Neural Networks 22 (2011) 1119–1132.
ware images: visualization and automatic classification, in: [34] E. Rublee, V. Rabaud, K. Konolige, G. Bradski, Orb: An
Proceedings of the 8th international symposium on visualiza- efficient alternative to sift or surf, in: 2011 International Con-
tion for cyber security, 2011, pp. 1–7. ference on Computer Vision, 2011, pp. 2564–2571. doi:10.
[14] B. N. Narayanan, V. S. P. Davuluru, Ensemble malware 1109/ICCV.2011.6126544.
classification system using deep neural networks, Electronics [35] H. Bay, T. Tuytelaars, L. Van Gool, Surf: Speeded up
9 (2020) 721. robust features, in: European conference on computer vision,
[15] J. Zhu, J. Jang-Jaccard, A. Singh, P. A. Watters, S. Camtepe, Springer, 2006, pp. 404–417.
Task-aware meta learning-based siamese neural network [36] D. G. Lowe, Object recognition from local scale-invariant
for classifying obfuscated malware, arXiv preprint features, in: Proceedings of the seventh IEEE international
arXiv:2110.13409 (2021). conference on computer vision, volume 2, Ieee, 1999, pp.
[16] D. Chauhan, H. Singh, H. Hooda, R. Gupta, Classification 1150–1157.
of malware using visualization techniques, in: International [37] T. Ojala, M. Pietikäinen, D. Harwood, A comparative study
Conference on Innovative Computing and Communications, of texture measures with classification based on featured dis-
Springer, 2022, pp. 739–750. tributions, Pattern recognition 29 (1996) 51–59.
[17] A. Darem, J. Abawajy, A. Makkar, A. Alhashmi, S. Alanazi, [38] P. F. Alcantarilla, A. Bartoli, A. J. Davison, Kaze features, in:
Visualization and deep-learning-based malware variant detec- European conference on computer vision, Springer, 2012, pp.
tion using opcode-level features, Future Generation Computer 214–227.
Systems 125 (2021) 314–323. [39] M. A. Azad, F. Riaz, A. Aftab, S. K. J. Rizvi, J. Arshad,
[18] M. Asam, S. J. Hussain, M. Mohatram, S. H. Khan, T. Jamal, H. F. Atlam, Deepsel: A novel feature selection for early
A. Zafar, A. Khan, M. U. Ali, U. Zahoora, Detection of ex- identification of malware in mobile applications, Future
ceptional malware variants using deep boosted feature spaces Generation Computer Systems 129 (2022) 54–63.
and machine learning, Applied Sciences 11 (2021) 10464. [40] S. Tobiyama, Y. Yamaguchi, H. Shimada, T. Ikuse, T. Yagi,
[19] M. Xiao, C. Guo, G. Shen, Y. Cui, C. Jiang, Image-based Malware detection with deep neural network using process
malware classification using section distribution information, behavior, in: 2016 IEEE 40th Annual Computer Software and
Computers & Security 110 (2021) 102420. Applications Conference (COMPSAC), volume 2, 2016, pp.
[20] A. Çayır, U. Ünal, H. Dağ, Random capsnet forest model for 577–582. doi:10.1109/COMPSAC.2016.151.
imbalanced malware type classification task, Computers & [41] R. Ye, Q. Dai, Implementing transfer learning across different
Security 102 (2021) 102133. datasets for time series forecasting, Pattern Recognition 109
[21] J. H. Go, T. Jan, M. Mohanty, O. P. Patel, D. Puthal, (2021) 107617.
M. Prasad, Visualization approach for malware classification [42] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma,
with resnext, in: 2020 IEEE Congress on Evolutionary Com- Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al., Im-
putation (CEC), 2020, pp. 1–7. doi:10.1109/CEC48606.2020. agenet large scale visual recognition challenge, International
9185490. journal of computer vision 115 (2015) 211–252.
[22] A. Bensaoud, N. Abudawaood, J. Kalita, Classifying malware [43] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei,
images with convolutional neural network models, Interna- Q. Zheng, Imcfn: Image-based malware classification using
tional Journal of Network Security 22 (2020) 1022–1031. fine-tuned convolutional neural network architecture, Com-
[23] W. El-Shafai, I. Almomani, A. AlKhayer, Visualized mal- puter Networks 171 (2020) 107138.
ware multi-classification framework using fine-tuned cnn- [44] E. Rezende, G. Ruppert, T. Carvalho, F. Ramos, P. De Geus,
based transfer learning models, Applied Sciences 11 (2021). Malicious software classification using transfer learning of
[24] J. Hemalatha, S. A. Roseline, S. Geetha, S. Kadry, R. Damaše- resnet-50 deep neural network, in: 2017 16th IEEE Inter-
vičius, An efficient densenet-based deep learning model for national Conference on Machine Learning and Applications
malware detection, Entropy 23 (2021) 344. (ICMLA), IEEE, 2017, pp. 1011–1014.
[45] N. Bhodia, P. Prajapati, F. Di Troia, M. Stamp, Transfer learning for image-based malware classification, arXiv preprint arXiv:1903.11551 (2019).
[46] Y. Qiao, B. Zhang, W. Zhang, Malware classification method based on word vector of bytes and multilayer perception, in: ICC 2020-2020 IEEE International Conference on Communications (ICC), IEEE, 2020, pp. 1–6.
[47] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556 (2014).
[48] N. Ketkar, E. Santana, Deep learning with Python, volume 1, Springer, 2017.
[49] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., Tensorflow: A system for large-scale machine learning, in: 12th USENIX symposium on operating systems design and implementation (OSDI 16), 2016, pp. 265–283.
[50] Sudhakar, S. Kumar, Mcft-cnn: Malware classification with fine-tune convolution neural networks using traditional and transfer learning in internet of things, Future Generation Computer Systems 125 (2021) 334–351.
[51] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
[52] J. H. Go, T. Jan, M. Mohanty, O. P. Patel, D. Puthal, M. Prasad, Visualization approach for malware classification with resnext, in: 2020 IEEE Congress on Evolutionary Computation (CEC), 2020, pp. 1–7. doi:10.1109/CEC48606.2020.9185490.
[53] U. Von Luxburg, I. Guyon, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett, Advances in neural information processing systems 30, in: 31st annual conference on neural information processing systems (NIPS 2017), Long Beach, California, USA, 2017, pp. 4–9.
[54] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818–2826.
[55] R. U. Khan, X. Zhang, R. Kumar, Analysis of resnet and googlenet models for malware detection, Journal of Computer Virology and Hacking Techniques 15 (2019) 29–37.
[56] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1–9.
[57] M. Tan, Q. Le, Efficientnet: Rethinking model scaling for convolutional neural networks, in: International Conference on Machine Learning, PMLR, 2019, pp. 6105–6114.
[58] C. Szegedy, S. Ioffe, V. Vanhoucke, A. Alemi, Inception-v4, inception-resnet and the impact of residual connections on learning, Proceedings of the AAAI Conference on Artificial Intelligence 31 (2017).
[59] F. Chollet, Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1251–1258.
[60] S. Sabour, N. Frosst, G. E. Hinton, Dynamic routing between capsules, arXiv preprint arXiv:1710.09829 (2017).
[61] D. Gibert, C. Mateu, J. Planes, The rise of machine learning for detection and classification of malware: Research developments, trends and challenges, Journal of Network and Computer Applications 153 (2020) 102526.
[62] V. Kocaman, O. M. Shir, T. Bäck, Improving model accuracy for imbalanced image classification tasks by adding a final batch normalization layer: An empirical study, in: 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 10404–10411. doi:10.1109/ICPR48806.2021.9412907.
[63] S. Alaraimi, K. E. Okedu, H. Tianfield, R. Holden, O. Uthmani, Transfer learning networks with skip connections for classification of brain tumors, International Journal of Imaging Systems and Technology (2021).
[64] S. Eum, H. Lee, H. Kwon, Going deeper with cnn in malicious crowd event classification, in: Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII, volume 10646, International Society for Optics and Photonics, 2018, p. 1064616.
[65] D. Jurafsky, J. H. Martin, Speech and Language Processing, 3rd ed., Prentice Hall, 2021.
[66] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781 (2013).
[67] M. Pagliardini, P. Gupta, M. Jaggi, Unsupervised learning of sentence embeddings using compositional n-gram features, arXiv preprint arXiv:1703.02507 (2017).
[68] A. Bensaoud, J. Kalita, Cnn-lstm and transfer learning models for malware classification based on opcodes and api calls, Knowledge-Based Systems (2024) 111543.
[69] M. Mimura, R. Ito, Applying nlp techniques to malware detection in a practical environment, International Journal of Information Security (2021) 1–13.
[70] M.-T. Luong, H. Pham, C. D. Manning, Effective approaches to attention-based neural machine translation, arXiv preprint arXiv:1508.04025 (2015).
[71] Z. Lu, X. Li, Y. Liu, C. Zhou, J. Cui, B. Wang, M. Zhang, J. Su, Exploring multi-stage information interactions for multi-source neural machine translation, IEEE/ACM Transactions on Audio, Speech, and Language Processing (2021) 1–1.
[72] Y. Gao, H. Gong, X. Ding, B. Guo, Image recognition based on mixed attention mechanism in smart home appliances, in: 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), volume 5, 2021, pp. 1501–1505. doi:10.1109/IAEAC50856.2021.9391092.
[73] R. Z. AlMazrouei, J. Nelci, S. A. Salloum, K. Shaalan, Feasibility of using attention mechanism in abstractive summarization, in: International Conference on Emerging Technologies and Intelligent Systems, Springer, 2021, pp. 13–20.
[74] Z. Niu, G. Zhong, H. Yu, A review on the attention mechanism of deep learning, Neurocomputing 452 (2021) 48–62.
[75] S. Ren, L. Zhou, S. Liu, F. Wei, M. Zhou, S. Ma, Semface: Pre-training encoder and decoder with a semantic interface for neural machine translation, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021, pp. 4518–4527.
[76] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint arXiv:1409.0473 (2014).
[77] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, in: Advances in neural information processing systems, 2017, pp. 5998–6008.
[78] O. Or-Meir, A. Cohen, Y. Elovici, L. Rokach, N. Nissim, Pay attention: Improving classification of pe malware using attention mechanisms based on system call analysis, in: 2021 International Joint Conference on Neural Networks (IJCNN), 2021, pp. 1–8. doi:10.1109/IJCNN52387.2021.9533481.
[79] H. Yakura, S. Shinozaki, R. Nishimura, Y. Oyama, J. Sakuma, Neural malware analysis with attention mechanism, Computers & Security 87 (2019) 101592.
[80] M. Mimura, T. Ohminami, Using lsi to detect unknown malicious vba macros, Journal of Information Processing 28 (2020) 493–501.
[81] X. Ma, S. Guo, H. Li, Z. Pan, J. Qiu, Y. Ding, F. Chen, How to make attention mechanisms more practical in malware classification, IEEE Access 7 (2019) 155270–155280.
[82] Girinoto, H. Setiawan, P. A. W. Putro, Y. R. Pramadi, Comparison of lstm architecture for malware classification, in: 2020 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS), 2020, pp. 93–97. doi:10.1109/ICIMCIS51567.2020.9354301.
[83] R. Agrawal, J. W. Stokes, K. Selvaraj, M. Marinescu, Attention in recurrent neural networks for ransomware detection, in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 3222–3226. doi:10.1109/ICASSP.2019.8682899.
[84] J. Kim, Y. Ban, E. Ko, H. Cho, J. H. Yi, Mapas: a practical deep learning-based android malware detection system, International Journal of Information Security (2022) 1–14.
[85] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. Ross, G. Stringhini, Mamadroid: Detecting android malware by building markov chains of behavioral models (extended version), ACM Transactions on Privacy and Security (TOPS) 22 (2019) 1–34.
[86] J.-Y. Kim, S.-B. Cho, Obfuscated malware detection using deep generative model based on global/local features, Computers & Security 112 (2022) 102501.
[87] G. Olani, C.-F. Wu, Y.-H. Chang, W.-K. Shih, Deepware: Imaging performance counters with deep learning to detect ransomware, IEEE Transactions on Computers (2022) 1–1.
[88] W. Lian, G. Nie, Y. Kang, B. Jia, Y. Zhang, Cryptomining malware detection based on edge computing-oriented multi-modal features deep learning, China Communications 19 (2022) 174–185.
[89] A. Bensaoud, J. Kalita, Deep multi-task learning for malware image classification, Journal of Information Security and Applications 64 (2022) 103057.
[90] S. Heron, Advanced encryption standard (aes), Network Security 2009 (2009) 8–12.
[91] S. B. Sasi, N. Sivanandam, A survey on cryptography using optimization algorithms in wsns, Indian Journal of Science and Technology 8 (2015) 216.
[92] M. Mahendra, P. S. Prabha, Classification of security levels to enhance the data sharing transmissions using blowfish algorithm in comparison with data encryption standard, in: 2022 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), IEEE, 2022, pp. 1154–1160.
[93] C. M. Kota, C. Aissi, Implementation of the rsa algorithm and its cryptanalysis, in: proceedings of the 2002 ASEE Gulf-Southwest Annual Conference, 2002, pp. 20–22.
[94] T. R. Lee, J. S. Teh, N. Jamil, J. L. S. Yan, J. Chen, Lightweight block cipher security evaluation based on machine learning classifiers and active s-boxes, IEEE Access 9 (2021) 134052–134064.
[95] A. Baksi, Machine learning-assisted differential distinguishers for lightweight ciphers, in: Classical and Physical Security of Symmetric Key Cryptographic Algorithms, Springer, 2022, pp. 141–162.
[96] S. Kok, A. Azween, N. Jhanjhi, Evaluation metric for crypto-ransomware detection using machine learning, Journal of Information Security and Applications 55 (2020) 102646.
[97] Y. Ding, G. Wu, D. Chen, N. Zhang, L. Gong, M. Cao, Z. Qin, Deepedn: a deep-learning-based image encryption and decryption network for internet of medical things, IEEE Internet of Things Journal 8 (2020) 1504–1518.
[98] H. Kim, J. Park, H. Kwon, K. Jang, H. Seo, Convolutional neural network-based cryptography ransomware detection for low-end embedded processors, Mathematics 9 (2021) 705.
[99] S. Sharmeen, Y. A. Ahmed, S. Huda, B. Ş. Koçer, M. M. Hassan, Avoiding future digital extortion through robust protection against ransomware threats using deep learning based adaptive approaches, IEEE Access 8 (2020) 24522–24534.
[100] F. Fischer, H. Xiao, C.-Y. Kao, Y. Stachelscheid, B. Johnson, D. Razar, P. Fawkesley, N. Buckley, K. Böttinger, P. Muntean, et al., Stack overflow considered helpful! deep learning security nudges towards stronger cryptography, in: 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 339–356.
[101] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 1135–1144.
[102] A. Shrikumar, P. Greenside, A. Kundaje, Learning important features through propagating activation differences, in: International conference on machine learning, PMLR, 2017, pp. 3145–3153.
[103] A. Nadeem, D. Vos, C. Cao, L. Pajola, S. Dieck, R. Baumgartner, S. Verwer, Sok: Explainable machine learning for computer security applications, arXiv preprint arXiv:2208.10605 (2022).
[104] Y. Vivek, V. Ravi, A. A. Mane, L. R. Naidu, Explainable artificial intelligence and causal inference based atm fraud detection, arXiv preprint arXiv:2211.10595 (2022).
[105] M. Kinkead, S. Millar, N. McLaughlin, P. O’Kane, Towards explainable cnns for android malware detection, Procedia Computer Science 184 (2021) 959–965.
[106] N. McLaughlin, J. Martinez del Rincon, B. Kang, S. Yerima, P. Miller, S. Sezer, Y. Safaei, E. Trickel, Z. Zhao, A. Doupé, et al., Deep android malware detection, in: Proceedings of the seventh ACM on conference on data and application security and privacy, 2017, pp. 301–308.
[107] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, W. Samek, On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation, PloS one 10 (2015) e0130140.
[108] S. Hooker, D. Erhan, P.-J. Kindermans, B. Kim, A benchmark for interpretability methods in deep neural networks, Advances in neural information processing systems 32 (2019).
[109] Y.-S. Lin, W.-C. Lee, Z. B. Celik, What do you see? evaluation of explainable artificial intelligence (xai) interpretability through neural backdoors, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 1027–1035.
[110] W. Guo, D. Mu, J. Xu, P. Su, G. Wang, X. Xing, Lemna: Explaining deep learning based security applications, in: proceedings of the 2018 ACM SIGSAC conference on computer and communications security, 2018, pp. 364–379.
[111] A. Kuppa, N.-A. Le-Khac, Black box attacks on explainable artificial intelligence (xai) methods in cyber security, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8.
[112] D. Rao, S. Mane, Zero-shot learning approach to adaptive cybersecurity using explainable ai, arXiv preprint arXiv:2106.14647 (2021).
[113] F. Zhong, X. Cheng, D. Yu, B. Gong, S. Song, J. Yu, Malfox: camouflaged adversarial malware example generation based on conv-gans against black-box detectors, IEEE Transactions on Computers (2023).
[114] Z. Zhao, Z. Li, F. Zhang, Z. Yang, S. Luo, T. Li, R. Zhang, K. Ren, Sage: Steering the adversarial generation of examples with accelerations, IEEE Transactions on Information Forensics and Security 18 (2023) 789–803.
[115] W. Hu, Y. Tan, Generating adversarial malware examples for black-box attacks based on gan, in: Data Mining and Big Data: 7th International Conference, DMBD 2022, Beijing, China, November 21–24, 2022, Proceedings, Part II, Springer, 2023, pp. 409–423.
[116] X. Ling, L. Wu, J. Zhang, Z. Qu, W. Deng, X. Chen, Y. Qian, C. Wu, S. Ji, T. Luo, et al., Adversarial attacks against windows pe malware detection: A survey of the state-of-the-art, Computers & Security (2023) 103134.
[117] G. Xu, G. Xin, L. Jiao, J. Liu, S. Liu, M. Feng, X. Zheng, Ofei: A semi-black-box android adversarial sample attack framework against dlaas, IEEE Transactions on Computers (2023).
[118] Y. Qiao, W. Zhang, Z. Tian, L. T. Yang, Y. Liu, M. Alazab, Adversarial elf malware detection method using model interpretation, IEEE Transactions on Industrial Informatics 19 (2022) 605–615.
[119] K. Meenakshi, G. Maragatham, An optimised defensive technique to recognize adversarial iris images using curvelet transform, Intelligent Automation & Soft Computing 35 (2023) 627–643.
[120] M. Pintor, L. Demetrio, A. Sotgiu, A. Demontis, N. Carlini, B. Biggio, F. Roli, Indicators of attack failure: Debugging and improving optimization of adversarial examples, Advances in Neural Information Processing Systems 35 (2022) 23063–23076.
Fig. 25: Training and testing for accuracy and loss of EfficientnetB6