SOA - A Malware Detection System Using A Hybrid Approach of Multi-Heads Attention-Based Control Flow Traces and Image Visualization
• Figure 3 depicts the reverse engineering procedure for retrieving Java code and DEX files.
• Reverse-engineering an application requires its APK.
• The APK Extractor file explorer is used to open the extracted APKs folder in the Internal
Storage directory.
• The chosen APKs are copied to system storage so they can be further processed. These
APKs are then reversed to reveal the code.
• This helps us understand the structure of the code and identify the security measures
the developers have implemented to resist reverse engineering attacks.
• The [app].apk file is renamed to [app].zip and then unzipped. The
classes.dex file, which includes the app code, can be found within the extracted folder.
• A Dalvik Executable, or DEX file, is an executable file that runs on the Android OS and
contains the compiled code.
• The Jadx decompiler is then used to decompile the DEX file and extract the Java code. In the
proposed work, the Java code and DEX files are used together to extract
features [21]. The reverse engineering process is shown in Algorithm 1.
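As a rough illustration of the unpacking step above: an APK is a ZIP archive, so its DEX files can be read directly with a ZIP library, without renaming the file first. The sketch below uses Python's standard `zipfile` module. The helper name `extract_dex_files` and the `classes*.dex` name filter are assumptions made for illustration, not part of the original work, and decompilation with Jadx is a separate step that is not shown.

```python
import zipfile

DEX_MAGIC = b"dex\n"  # DEX files begin with these four bytes

def extract_dex_files(apk):
    """Return {entry_name: bytes} for each classes*.dex entry in an APK.

    `apk` may be a filesystem path or a binary file-like object.
    """
    dex_files = {}
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            # Assumption: app code lives in top-level classes.dex, classes2.dex, ...
            if name.startswith("classes") and name.endswith(".dex"):
                data = archive.read(name)
                # Keep only entries that really look like DEX files.
                if data[:4] == DEX_MAGIC:
                    dex_files[name] = data
    return dex_files
```

The returned bytes can be written to disk and handed to a decompiler such as Jadx to recover the Java code.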
Proposed methodology
• Graph-based methods focus on analyzing API Call Graphs (ACGs) instead of the whole
CFG, which makes the analysis faster and more specific to detecting malicious behavior.
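One way to picture how an API call graph can be derived from a control flow graph is sketched here. It is a minimal sketch under stated assumptions, not the authors' algorithm: each CFG node is taken to be a single instruction, `calls` maps a node to the method it invokes (or `None`), framework APIs are recognised by the hypothetical `API_PREFIXES`, and an ACG edge links each API call to the next API calls reachable through non-API instructions.

```python
from collections import defaultdict

# Assumption: framework APIs are recognised by these smali-style prefixes.
API_PREFIXES = ("Landroid/", "Ljava/", "Ljavax/")

def is_api_call(method):
    return method is not None and method.startswith(API_PREFIXES)

def build_acg(cfg_edges, calls):
    """Collapse a CFG into edges between consecutive API calls.

    cfg_edges: iterable of (src_node, dst_node) control-flow edges.
    calls:     dict mapping node -> invoked method name, or None.
    Returns a set of (api_method, next_api_method) pairs.
    """
    successors = defaultdict(set)
    for src, dst in cfg_edges:
        successors[src].add(dst)

    def next_api_nodes(start):
        # Walk forward through non-API nodes; stop at the first API node on each path.
        found, seen = set(), set()
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if is_api_call(calls.get(node)):
                found.add(node)
            else:
                stack.extend(successors[node])
        return found

    acg = set()
    for node, method in calls.items():
        if is_api_call(method):
            for nxt in next_api_nodes(node):
                acg.add((method, calls[nxt]))
    return acg
```

Because non-API instructions are skipped, the resulting graph is much smaller than the CFG it came from, which is the speed advantage the bullet above refers to.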
• A copy of the image is generated that is identical in terms of scaling and rotation.
• The combination of the BRIEF descriptor and FAST extractor is used to highlight features.
• Stacked Generalization (Ensemble Learning): base learners are
combined with a meta-learner (Logistic Regression).
• This ensemble method improves the accuracy by learning
the best combination of predictions from different models.
Results and Discussion
• In Fig. 7, the individual learners are the level-0 learners, and the combiner is the level-1 learner. Details of
the stacked generalization follow.
• 1. Level-0: Also known as the base learners. The deep features are divided into training and testing sets,
and the training set is used to fit the base learning models. We combine several
models as base learners: Gaussian Naïve Bayes (GNB), Support Vector Machine (SVM)
with Radial Basis Function (RBF), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbor (KNN), and
Multi-Layer Perceptron (MLP). Each base learner then makes predictions on out-of-sample data.
• 2. Level-1: Also known as the meta-learner. The predictions of the base learners are fed to the meta-learner
as input, and a single meta-learner learns to detect malware accurately from them. We used Logistic
Regression (LR) as the meta-learner. To prevent overfitting, the meta-learner is trained on different
instances than those used to train the base learners: the testing part of the deep features is used to train the
meta-learner.
• Compared to individual models, the stacked ensemble achieves better malware detection and classification results. It
learns the best linear combination of the models, which gives an optimal blend of
diversity from each model and the highest level of detection accuracy. However, the computation
time for a stacked ensemble is longer than for any single model. Algorithm 4 depicts the process of
detecting malware using hybrid features.
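The two-level scheme described above can be sketched in plain Python. This is a toy illustration only: it swaps in two simple base learners (nearest centroid and k-nearest neighbours) for the six models named above, uses a hand-written logistic regression as the meta-learner, and all function names and data points are hypothetical. What it does preserve is the key idea that the meta-learner is trained on base-learner predictions for instances the base learners did not see.

```python
import math

def centroid_learner(X, y):
    """Level-0 learner: predicts the class whose centroid is nearer."""
    def mean(points):
        return [sum(col) / len(points) for col in zip(*points)]
    c0 = mean([x for x, t in zip(X, y) if t == 0])
    c1 = mean([x for x, t in zip(X, y) if t == 1])
    return lambda x: 1.0 if math.dist(x, c1) < math.dist(x, c0) else 0.0

def knn_learner(X, y, k=3):
    """Level-0 learner: fraction of the k nearest neighbours labelled 1."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: math.dist(x, p[0]))[:k]
        return sum(t for _, t in nearest) / k
    return predict

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Level-1 meta-learner: logistic regression fitted by gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, t in zip(Z, y):
            p = 1 / (1 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            grad = p - t
            w = [wi - lr * grad * zi for wi, zi in zip(w, z)]
            b -= lr * grad
    return lambda z: 1 / (1 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))

def fit_stack(X_base, y_base, X_meta, y_meta):
    """Fit base learners on one split and the meta-learner on a held-out split."""
    bases = [centroid_learner(X_base, y_base), knn_learner(X_base, y_base)]
    # Meta-features: base-learner outputs on data the base learners never saw.
    Z = [[f(x) for f in bases] for x in X_meta]
    meta = fit_logistic(Z, y_meta)
    return lambda x: int(meta([f(x) for f in bases]) >= 0.5)

# Toy data (hypothetical): class 0 near the origin, class 1 near (5, 5).
X_base = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y_base = [0, 0, 0, 1, 1, 1]
X_meta = [(0.5, 0.5), (1, 1), (5.5, 5.5), (4, 5)]
y_meta = [0, 0, 1, 1]
predict = fit_stack(X_base, y_base, X_meta, y_meta)
```

Every prediction has to pass through all base learners and then the meta-learner, which is why the stacked ensemble costs more computation than any single model.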
• The experimental setup and the evaluation of results are designed to verify the
effectiveness of the proposed system.
• We prepared a customized dataset from CIC-InvesAnd-Mal2019 [31] by using reverse
engineering and data mining tools. Originally, the dataset is available in the form of
APKs.
• It includes four types of malware: adware, ransomware, scareware, and SMS.
Each malware type is further subdivided into 10 to 11 families. This dataset was
compiled by installing 5,000 samples on real Android devices.
• These samples originate from 342 malicious Android apps spanning 42 distinct families, as
shown in Table 3.
• These APKs are unpacked and thoroughly analyzed to prepare our customized dataset for
effective malware detection, as shown in Table 4.
• The Java code and DEX files are obtained by reverse engineering the
Android APKs.
• Approximately 3.2K ACGs are collected from adware and ransomware, and 3.4K
ACGs from scareware and SMS.
• Similarly, the proposed method crawls the train and texture features with 8.4K for both
adware and ransomware and 8.6K for scareware and SMS. These features are combined
further to extract deep features for improved malware classification results.
• The comparison of the five malware detection performance measures
is shown in Table 5. The KNN model has the lowest performance, with
precision, recall, F1-score, MCC, and accuracy of 96%, 98%, 97%,
97.42%, and 97.12%, respectively.
• The proposed ensemble model performs best, with
precision, recall, F1-score, MCC, and accuracy of 99%, 99%, 99%,
99.14%, and 99.27%, respectively.
• The MLP comes in second place after the ensemble.
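For reference, the five measures reported above can all be computed from a binary confusion matrix. The sketch below uses the standard definitions. The function name and the example counts in the usage note are hypothetical, and the code assumes no denominator is zero.

```python
import math

def detection_metrics(tp, fp, fn, tn):
    """Precision, recall, F1-score, MCC, and accuracy from confusion-matrix counts.

    Assumes every denominator is non-zero (no empty row or column).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Matthews correlation coefficient: uses all four cells of the matrix.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"precision": precision, "recall": recall, "f1": f1,
            "mcc": mcc, "accuracy": accuracy}
```

For example, `detection_metrics(tp=50, fp=10, fn=5, tn=35)` returns all five values as fractions between -1 and 1 (MCC) or 0 and 1 (the others); multiplying by 100 gives percentages in the style of Table 5.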
• The performance comparison for malware classification is shown in Table 7. The ensemble
provides the best classification results, with precision, recall, F1-score, MCC, and accuracy of
100%, 98%, 98%, 98.52%, and 99.17%, respectively, while the SVM-rbf achieves the lowest.
• Figure 9 shows the training and testing epoch curves for malware classification
using accuracy, loss, precision, and recall.
• In part (a), using training data, the accuracy curve starts from 50% and gradually
increases to reach 83% at the 20th epoch.
• It then rises further and reaches 98% at the 40th epoch, after which it is more or
less constant. Conversely, the loss starts from 75% and drops gradually
to 20% by the 22nd epoch.
• The loss then falls to 4% and is more or less constant after the 40th epoch. The
precision and recall behave similarly to accuracy, which indicates that the proposed
approach performs well on training data.
• In part (b), the same performance measures are shown for testing data. The
accuracy, precision, and recall fluctuate abruptly at times but still reach the best
performance.
• There is a brief drop to 75% and a rise in loss to 32%, after which
the curves stabilize.
• Figure 10 depicts the malware classification for each type of malware, namely adware,
ransomware, scareware, and SMS.
• The precision, recall, and F1-score are indicated by the blue, orange, and gray colours.
The recall is lowest when using base and meta learners, while the F1-score is the best.
• However, accuracy yields the best results for ransomware and scareware when using
ensemble, while it yields the worst results for adware when using LR and SVM-rbf.
• Accuracy and F1-score drop to 84% when using SVM-rbf for adware,
indicating that this base learner provides the worst classification results. The ensemble
produces the best results overall.
ensemble has the highest.
• For instance, the classification results for
adware, ransomware, scareware, and SMS
are 93%, 93%, 92%, and 97%, respectively,
whereas the ensemble has 100%, 98%, 98%,
and 100% for the same classes.
• The proposed hybrid approach with
the ensemble model outperforms the
base learners for each malware variant.
• Table 8 summarizes the performance of the proposed approach for
adware families, which include dowgin, ewind, feiwo, gooligan,
kemoge, koodous, mobidash, selfmite, shuanet, and youmi.
• Compared to the other families, feiwo, koodous, and shuanet have
the best classification results.
• For feiwo, koodous, and shuanet, the precision, recall, and F1-score
are (99%, 100%, 100%), (100%, 100%, 100%), and (100%, 99%,
100%), respectively.
• However, kemoge and youmi produce the lowest results.
• For instance, the precision, recall, and F1-score for kemoge and
youmi are (97%, 96%, 96%), (98%, 96%, 97%), and (98%, 96%,
97%), respectively.
The paper introduces a new
method to detect malware by
combining two techniques:
• ACGs (API Call Graphs): These
graphs represent the behavior of
an app by tracking its API calls.
• Malware Images: The app’s code
is converted into an image, and
features are extracted from this
image.
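A common way to turn compiled code into an image is to treat each byte as one grayscale pixel. The sketch below shows that idea only. Whether the original work uses exactly this mapping is an assumption, and the function name and the default row width of 256 are hypothetical choices.

```python
def bytes_to_grayscale(data, width=256):
    """Arrange raw bytes into rows of `width` pixels with values 0-255.

    Assumption: one byte maps to one grayscale pixel; the last row is zero-padded.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes(-len(data) % width)  # pad up to a whole number of rows
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]
```

The resulting 2-D grid can be passed to any image feature extractor, such as the FAST and BRIEF combination mentioned in the methodology.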
Conclusion
• Reverse Engineering: To
analyze an app, its DEX file
(compiled code) and Java
source code are extracted.
• Creating ACGs: API calls are
collected from the app’s control
flow graphs (CFGs) to create
ACGs, which act like a digital
fingerprint of the app’s activity.
• Attention-Based Transfer
Learning: This method uses
multiple heads (like focusing on
different parts of the data) to
extract important features from
ACGs.
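To make the "multiple heads" idea concrete, here is a minimal sketch of multi-head self-attention in plain Python. It is a simplification under stated assumptions: the learned query, key, value, and output projections of a real transformer are omitted, so each head simply attends over its own slice of the embedding, and all names are hypothetical.

```python
import math

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    outputs, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wi * v[j] for wi, v in zip(w, V))
                        for j in range(len(V[0]))])
    return outputs, weights

def multi_head_self_attention(X, num_heads):
    """Split each embedding into `num_heads` slices, attend per slice, concatenate.

    Simplification: no learned projections, so Q = K = V = the slice itself.
    X is a list of equal-length lists (one embedding per token).
    """
    d_model = len(X[0])
    if d_model % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        part = [x[h * d_head:(h + 1) * d_head] for x in X]
        out, _ = attention_head(part, part, part)
        head_outputs.append(out)
    # Concatenate the per-head outputs back into one vector per token.
    return [sum((head[i] for head in head_outputs), []) for i in range(len(X))]
```

Each head computes its own attention weights, so different heads can emphasise different relationships between the tokens of an API call sequence.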
• Combining Features: The features from ACGs and
malware images are combined to improve malware
detection accuracy.
• High Accuracy: The proposed method achieves a
high classification accuracy of 99.27% on the customized
dataset derived from CIC-InvesAnd-Mal2019. It outperforms other methods,
including one that uses BERT-base with texture
features, which has a 98.52% accuracy.