Zero-Day Network Intrusion Detection Using Machine Learning Approach
Article in International Journal on Recent and Innovation Trends in Computing and Communication · August 2023
DOI: 10.17762/ijritcc.v11i8s.7190
All content following this page was uploaded by Naushad Alam on 19 October 2023.
Abstract: Zero-day network attacks are a growing global cybersecurity concern. Hackers exploit vulnerabilities in network systems, making
network traffic analysis crucial in detecting and mitigating unauthorized attacks. However, inadequate and ineffective network traffic analysis
can lead to prolonged network compromises. To address this, machine learning-based zero-day network intrusion detection systems (ZDNIDS)
rely on monitoring and collecting relevant information from network traffic data. The selection of pertinent features is essential for optimal
ZDNIDS performance given the voluminous nature of network traffic data, which is characterized by a large number of attributes. Unfortunately, current machine learning
models utilized in this field exhibit inefficiency in detecting zero-day network attacks, resulting in a high false alarm rate and overall
performance degradation. To overcome these limitations, this paper introduces a novel approach combining the anomaly-based extended
isolation forest algorithm with the BAT algorithm and Nevergrad. Furthermore, the proposed model was evaluated using 5G network traffic,
showcasing its effectiveness in efficiently detecting both known and unknown attacks, thereby reducing false alarms when compared to existing
systems. This advancement contributes to improved internet security.
IJRITCC | July 2023, Available @ http://www.ijritcc.org
International Journal on Recent and Innovation Trends in Computing and Communication
ISSN: 2321-8169 Volume: 11 Issue: 8s
DOI: https://doi.org/10.17762/ijritcc.v11i8s.7190
Article Received: 22 April 2023 Revised: 10 June 2023 Accepted: 26 June 2023
They have the potential to detect all types of attacks, known or unknown, including zero-day attacks. The fundamental problem with anomaly-based detection techniques is that they need a tuning stage and have high false-positive rates. Many studies suggest using machine learning techniques for cyber intrusion detection to enhance the detection rate and reduce false-positive rates [3, 6, 7]. Recent research focuses on anomaly-based intrusion detection systems, which can be classified by the machine learning methods used: supervised (classification), unsupervised (clustering and anomaly-based detection), semi-supervised, and reinforcement techniques [6]. Supervised IDS uses labeled data to train a model [8, 9]. However, when compared to signature-based IDS, supervised IDS models are less efficient at detecting zero-day attacks, and they require frequent retraining, which is hard to achieve because labeled data is difficult to obtain. Semi-supervised IDS utilizes a combination of labeled and unlabeled data to build a model. In unsupervised IDS, clustering algorithms are employed to identify anomalies in unlabeled data. These techniques aim to group similar data together while maintaining dissimilarity between clusters, without relying on attack signatures, explicit attack descriptions, or labeled data for training. Unsupervised intrusion detection methods can detect both known and unknown attacks, eliminating the requirement for labeled data. These methods can extract features from different sources to address queries related to attribution and correlation [6].
In this paper, we introduce an unsupervised anomaly detection method that does not rely on prior knowledge. Our approach offers three key contributions. Firstly, we present a unique anomaly detection method that combines extended isolation forest with the BAT algorithm. Secondly, we optimize the extended isolation forest using Nevergrad. Lastly, we conduct experiments to compare our proposed method with three alternative approaches, utilizing diverse evaluation metrics.
Nevergrad is a Python library that offers a gradient-free optimization platform [39]. Its purpose is to optimize complex functions and models without relying on gradients, making it a valuable tool for machine learning and optimization tasks. Users can leverage Nevergrad to minimize objective functions, fine-tune hyperparameters, and execute various optimization tasks efficiently. The scalability of Nevergrad enables its application across a wide range of domains.
II. RELATED WORK
Numerous machine learning and data mining techniques have been suggested for cyber intrusion detection in the past twenty years. These include ant colony optimization [10], artificial neural networks [11, 12], particle swarm optimization [13], evolutionary computation [14], Support Vector Machine (SVM) [15], and Benford's law with semi-supervised machine learning [16]. The "deep transductive transfer learning" method proposed in [17] can identify zero-day attacks even in the absence of labeled data in the target domain. The experimental results demonstrate how well this approach detects zero-day attacks on previously unseen data. In other words, the method can identify previously unknown cyberattacks without requiring prior information [17]. The authors of [18] recommend using autoencoders to build an IDS model that accurately detects zero-day attacks with high recall and a low false-negative rate. According to the study, autoencoders are effective at detecting sophisticated zero-day attacks. The study also highlights the trade-off between recall and fallout in the proposed technique. In [19], a deep learning-based method for creating a flexible and effective network intrusion detection system (NIDS) is presented. The method's performance is assessed in terms of accuracy, precision, recall, and F-measure and compared with earlier methods. The final objective is to use deep learning methods to construct a real-time NIDS for real networks. The author of [20] conducts a thorough examination of the NSL-KDD dataset by extracting pertinent records and comparing different machine learning classifiers. The experiments showed that, of all the evaluated models, the Random Forest classifier had the highest average accuracy and outperformed the others in numerous tests. Another study aims to create an intrusion detection system with high detection rates and low false alarm rates. According to its experimental findings, the feature association impact scale (FAIS) model with all features had an accuracy of 88%, whereas the feature correlation analysis and association impact scale (FCAAIS) model with optimal features had an accuracy of 91%. Using canonical correlation for optimized attribute selection thus increased the accuracy by 3%. The sensitivity, specificity, and F-measure of FCAAIS were also higher than those of FAIS [21]. The key idea behind the investigation in [22] is to identify executable files connected to known vulnerabilities and their exploits. These findings have important ramifications for both future security technology and governmental initiatives. Recognizing such files can strengthen security protocols and provide more robust mitigation for possible security risks. The study's conclusions can influence current and future cybersecurity research and development projects [22]. To evaluate how well machine learning-based NIDSs can identify zero-day attacks, another study introduces a zero-shot learning technique. Despite strong zero-day detection rate (Z-DR) values in the majority of
attack classes, the study's findings show that some attack categories were not reliably recognized as zero-day threats. The Wasserstein Distance (WD) method, which directly relates feature distributions to the WD and Z-DR measures, was used to further corroborate the findings [23]. In [24], 356 severe attacks were used to test the capacity of Snort, running an out-of-date official rule set, to recognize zero-day attacks. The analysis showed that Snort has a 17% detection rate for zero-day attacks. ClusterRep, a reputation framework for vehicular ad hoc networks, is presented in another study. It involves cluster heads and members changing pseudonyms and reputation values. The study reports that it is more scalable and efficient than competing approaches, but there is still room for improvement in flexibility and resilience in the dynamic vehicular ad hoc network (VANET) environment [25]. A proactive network security strategy is presented that makes use of deep learning models to identify intrusions. The suggested system architecture covers both the development and the deployment of the machine learning application. The system can produce high-quality model performance in dynamic and quickly changing situations by combining deep learning modeling with scalable data pre-processing [26]. The work in [27] presents an add-on for IoT devices that detects URL-based attacks using a convolutional neural network (CNN) model and botnet attacks using a recurrent neural network with long short-term memory (RNN-LSTM) model hosted on back-end servers. The add-on is intended to improve IoT device micro-security. A bidirectional long short-term memory network with a multi-feature layer is proposed for detecting attacks that occur at various intervals. Compared with prior methods, the model's sequence and stage feature layers and its double-layer reverse unit result in lower false positive and false negative rates [28]. A novel deep learning approach for intrusion detection outperforms previous approaches in terms of accuracy, precision, and recall while requiring less training time. The method was tested on the KDD Cup '99 and NSL-KDD datasets, showing an improvement in accuracy of up to 5%. GPUs were used to build the classifier in TensorFlow [29].
Method | Approach | Reported result | Remarks and future work
CNN-SVM [30] | Signature-based, double-layered hybrid approach | R2L: 96.67, U2R: 100 | Future researchers can test the categorization's efficacy by using it on a dataset or network setting with more than four different types of attacks.
CNN-DCNN-LSTM [27] | Deep learning | CNN: 94.3, F1: 93.58 | The suggested method could be enhanced in the future to recognize new attacks on IoT systems and devices that use encrypted traffic to evade detection or hide their activities.
Machine Learning Model for NIDS [31] | Semi-supervised machine learning | Correlation coefficient: 74, F1-score: 85 | A system that combines several feature selection strategies with machine learning classifiers for improved performance requires additional research.
Deep Learning [19] | NDAE for unsupervised learning | 98.81 | In upcoming research, the researcher aims to enhance the model's ability to detect zero-day attacks and evaluate it further using world-backbone network traffic.
Unsupervised learning [32] | Deep learning-based unsupervised learning algorithms: K-Mean, SOM, DAGMM and ALAD | K-Mean: 97.6, ALAD: 89.9, SOM: 96.1 | In this paper, K-Mean and SOM are reliable, but ALAD is better at detecting rare attacks by using adversarial samples, and DAGMM does not perform well. The algorithms should be tested on additional datasets and combined for network flow anomaly detection.
III. DATASET DESCRIPTION
The 5G-NIDD dataset [33] is a comprehensive labeled dataset generated from a functional 5G test network. Its purpose is to facilitate the identification and detection of malicious content within network traffic. This dataset comprises substantial amounts of data collected from actual networks. A recent survey gives a brief overview of the datasets available up to 2020 that are useful for evaluating network intrusion detection [34]. Many of the existing datasets are outdated and may not be suitable for analyzing modern networks due to significant technological advancements. In contrast, the 5G-NIDD dataset [33] is a recent compilation that incorporates real 5G networks. It encompasses prevalent attacks, such as different port scans and a diverse range of DoS/DDoS attacks.
IV. METHODOLOGY
This research consists of two main parts. The first part is feature selection using the BAT metaheuristic algorithm, with the Extended Isolation Forest (EIF) trained to evaluate candidate feature subsets. The second part is training the model on the selected features and tuning the hyperparameters with the Nevergrad optimizer. The resulting method is then compared with other methods to determine its effectiveness. Figure 1 shows an overview of our methodology.
Fig 1: Overview of proposed methodology
This study aims to select important features from network packets using BAT-based optimization as a wrapper method. The selected subset of features is the output of the BATEIF algorithm, which improves the detection capacity of the system. Further, we use Nevergrad to optimize hyperparameters for the best results. The presented system is then tested and evaluated for its effectiveness in increasing detection accuracy and reducing false alarms. The 5G NIDD dataset is used as a benchmark to assess the system's performance.
BAT is a metaheuristic optimization algorithm that draws inspiration from how bats use echolocation to find prey. The key aspects of this behavior are simplified as follows [35]:
● Bats use echolocation to detect the distance to obstacles and prey.
● Bats fly at random using a velocity v_i and emit pulses with a frequency f_min, wavelength λ and loudness A_0 to find prey. Bats are able to adjust the frequency and rate of their echolocation pulses based on how close they are to obstacles in their environment. This adjustment happens spontaneously. The pulse emission rate is a value that ranges from 0 to 1.
● The loudness of the bat's echolocation varies from a high value of A_0 to a minimal value A_min.
BATEIF ALGORITHM
1. Initialization: Randomly initialize a population of n_bats bats, each represented by a binary vector of length N. Initialize the velocity v of each bat to zero and set the initial fitness values to zero.
2. Frequency and velocity update:
   f[j] = A * exp(-r * i) * cos(2 * pi * rand()) + gamma
   v[j] += (bats[j] - bats.mean(axis=0)) * f[j]
3. Position update:
   bats += v
4. Loudness and pulse rate update:
   loudness = alpha * loudness
   pulse_rate = exp(-gamma * i)
5. Fitness evaluation: The fitness of each bat is evaluated using a fitness function that measures the accuracy of an EIF trained on the subset of features selected by the binary vector representation of the bat.
6. Update of best solution: If a bat's fitness value is better than the current best fitness value, the bat's binary vector representation is set as the new best solution.
7. Termination: The algorithm terminates after a fixed number of iterations, and the final solution is the binary vector representation of the best bat.
8. The final step is to select the features corresponding to the binary vector representation of the best bat and return them as the selected features.
Here, N is the number of features in the input data. In each iteration of the algorithm, the frequency f and velocity v of each bat are updated according to the equations in step 2, where i is the current iteration number, j is the index of the current bat, A is a constant representing the initial loudness of the bat's calls, r and gamma are constants controlling the decay rates of loudness and frequency, and rand() generates a random number between 0 and 1. The position of each bat is then updated by adding its velocity vector to its current position. In the next step, the loudness and pulse rate of each bat are updated according to the equations in step 4, where alpha is a constant that controls the rate of loudness decay.
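The BATEIF steps listed above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name bat_feature_selection, the default constants, the toy_fitness stand-in (the paper instead scores a mask by the accuracy of an EIF trained on the selected features), and the 0.5 threshold that maps an updated position back to a bit are all hypothetical. The listing does not say how loudness and pulse rate feed back into the search, so step 4 is left out of the sketch.

```python
import math
import random


def bat_feature_selection(fitness, n_features, n_bats=8, n_iter=25,
                          A=2.0, r=0.1, gamma=0.5, seed=0):
    """Search for a binary feature mask that maximizes `fitness`."""
    rng = random.Random(seed)
    # Step 1: random binary positions and zero velocities.
    bats = [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(n_bats)]
    vel = [[0.0] * n_features for _ in range(n_bats)]
    best_mask = max(bats, key=fitness)[:]
    best_fit = fitness(best_mask)

    for i in range(1, n_iter + 1):
        # Per-feature population mean, i.e. bats.mean(axis=0) in the listing.
        mean = [sum(b[k] for b in bats) / n_bats for k in range(n_features)]
        for j in range(n_bats):
            # Step 2: frequency and velocity update.
            f = (A * math.exp(-r * i) * math.cos(2 * math.pi * rng.random())
                 + gamma)
            for k in range(n_features):
                vel[j][k] += (bats[j][k] - mean[k]) * f
                # Step 3: position update, thresholded back to a bit.
                # The 0.5 threshold is an assumption of this sketch.
                bats[j][k] = 1 if bats[j][k] + vel[j][k] > 0.5 else 0
            # Steps 5-6: evaluate the bat and keep the best mask seen so far.
            fit = fitness(bats[j])
            if fit > best_fit:
                best_mask, best_fit = bats[j][:], fit
    # Steps 7-8: stop after n_iter iterations and return the best mask.
    return best_mask, best_fit


# Hypothetical stand-in for "accuracy of an EIF trained on the selected
# features": reward three informative columns, penalize the others.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    useful = sum(1 for k, bit in enumerate(mask) if bit and k in INFORMATIVE)
    noisy = sum(1 for k, bit in enumerate(mask) if bit and k not in INFORMATIVE)
    return useful - 0.5 * noisy


mask, score = bat_feature_selection(toy_fitness, n_features=8)
selected = [k for k, bit in enumerate(mask) if bit]
```

In a real wrapper setup, toy_fitness would be replaced by a function that trains the detector on the columns in selected and returns a validation score, which makes each fitness call far more expensive than in this toy run.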
The BAT algorithm has some advantages that make
it a useful tool for solving classification and time series
prediction problems. Here are a few of these advantages [20].
First, the BAT uses echolocation and frequency tuning to
adjust its behavior during the problem-solving process. Second, it can adjust the frequency of its pulses to fine-tune its search, so the BAT can automatically zoom in on areas where potentially better solutions might be found. The BAT algorithm also converges rapidly on optimal solutions during the initial stages of the iteration process. Unlike several other algorithms, the BAT algorithm incorporates parameter control, enabling automatic adjustment of parameter values (A and r) throughout the iterations. This adaptive capability facilitates a seamless transition from exploration to exploitation, enhancing the algorithm's effectiveness in searching for the best solution.
V. DATA PREPROCESSING AND FEATURE SELECTION
Machine learning-based network intrusion detection systems (ML-NIDS) use input data called features to detect zero-day network attacks [36]. ML-NIDS can enhance their performance by leveraging crucial features that differentiate normal and anomalous network traffic.
Network traffic analysis (NTA) is a crucial component of Network Intrusion Detection Systems (NIDS) that involves capturing and analyzing network traffic data. Its primary objective is to identify and detect various threats, including zero-day network attacks. Nonetheless, the real-time extraction of significant features from network traffic data presents a challenge. A network traffic feature is considered significant if it can distinguish between normal and malicious traffic. The information regarding significant features is sourced from references [37, 38]. Our study aimed to tackle the issue of effectively extracting significant features to detect unfamiliar malicious attacks. In our approach, we first remove duplicate records and drop columns in which most values are zero. The dataset also contains categorical features, which we encode using one-hot encoding. Secondly, we check all features for zero variance and Pearson correlation and drop redundant features, because these features reduce the accuracy of our model. After that, we apply the chi-squared test to check the contribution of every feature. Further, we apply BATEIF to identify the most important features in the dataset by searching for a subset of features that maximizes the performance of the model. How well a ZDNIDS performs in terms of precision, recall, and F1 score determines the quality of the chosen features. The next section discusses the implementation of various ML models (EIF and OneClass SVM) for detecting zero-day attacks.
VI. RESULT AND DISCUSSION
This section outlines the implementation of the Extended Isolation Forest (EIF) and OneClass SVM models for the detection of zero-day network attacks. Prior to training, the dataset is divided into a training dataset and a testing dataset using an 80/20 ratio. After that, we used all the features selected by BATEIF to train our model and tuned the hyperparameters using Nevergrad. Further, we used four well-known evaluation metrics: precision, recall, F1 score, and accuracy.
Accuracy = (Number of correctly classified instances) / (Total number of instances)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 score = 2 * Precision * Recall / (Precision + Recall)
Here TP, FP, and FN denote the numbers of true positives, false positives, and false negatives.
Fig. 2 compares the EIF model with and without BATEIF. The model using BATEIF has an accuracy of 99%, whereas the model without BATEIF has an accuracy of 58%. The first model is therefore better: it is more accurate and has a higher detection rate.
Fig. 3 compares the OneClass SVM model with and without BATEIF. The model using BATEIF has an accuracy of 90%, whereas the model without BATEIF achieves 62%. Again, the model using BATEIF is more accurate and has a higher detection rate.
Fig. 4 compares the EIF and OneClass SVM models, both with BATEIF. The EIF model has 99% accuracy, whereas the OneClass SVM model has 90%. EIF therefore outperforms OneClass SVM, being more accurate and having a greater detection rate.
Metric (%) | EIF with BATEIF | EIF without BATEIF
Accuracy | 99 | 58
Precision | 99 | 59
Recall | 99 | 99
F1 Score | 99 | 72
Fig 2: Comparative analysis between EIF model with and without BATEIF
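The four evaluation metrics defined in Section VI can be computed directly from confusion-matrix counts. The short sketch below shows the arithmetic; the function name evaluation_metrics and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion counts.

    Assumes at least one predicted positive and one actual positive,
    otherwise precision or recall would divide by zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only: 200 flows, 120 of them attacks.
scores = evaluation_metrics(tp=90, fp=10, fn=30, tn=70)
```

With these counts the detector is correct on 160 of 200 flows (accuracy 0.80), 90 of its 100 alarms are real attacks (precision 0.90), and it catches 90 of 120 attacks (recall 0.75), so the F1 score lies between the two at roughly 0.82.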
Fig 3: Comparative analysis between OneClass SVM model with and without BATEIF
Table 2: Selected features when training/testing against the 5G NIDD dataset
Dataset Name | Selected Feature Index | Total Selected Features
5G NIDD | 1, 3, 6, 9, 10, 16, 25, 26, 28, 29, 31, 38, 40, 42, 47, 52, 56, 57, 58, 60, 61, 62, 64, 65 | 24
Metric (%) | EIF with BATEIF | OneClass SVM with BATEIF
Accuracy | 99 | 90
Precision | 99 | 90
Recall | 99 | 90
F1 Score | 99 | 86
Fig 4: Comparative analysis between OneClass SVM model and EIF with BATEIF
In general, the network features identified by BATEIF have demonstrated superior performance compared to existing systems. However, certain features, such as the source port, destination port, and seq, are not effective in detecting zero-day network attacks and consequently have a negative impact on the performance of our model.
The BATEIF algorithm with the Nevergrad optimizer is a novel algorithm for intrusion detection that has been shown to be effective at detecting both known and unknown attacks. The algorithm works by first identifying anomalous behavior in network traffic. This anomalous behavior is then used to build a model of the attack. The model is then used to detect new attacks that are similar to the known attacks. The extended isolation forest is a machine learning algorithm that has been shown to be effective at detecting outliers, that is, data points that are significantly different from the rest of the data. The extended isolation forest works by first building a forest of randomly partitioned isolation trees. Data points that are isolated in fewer splits receive higher anomaly scores and are flagged as anomalous. The BAT algorithm with the extended isolation forest was evaluated on the 5G NIDD dataset, which contains both known and unknown attacks. We compared the performance of EIF and OneClass SVM with and without the features selected by BATEIF. The results showed that the EIF, with the Nevergrad and BAT algorithms, is able to detect 99% of the known and unknown attacks. The results of this study show that the BAT algorithm and the extended isolation forest combined are effective at detecting intrusions. The BAT algorithm is more effective for feature selection, while the extended isolation forest is more effective at detecting attacks. In addition, OneClass SVM takes much longer to train than EIF.
VII. CONCLUSION
This research aims to present a more effective way to detect zero-day attacks. In this approach, BATEIF, a novel metaheuristic algorithm based on the binary version of the BAT algorithm, is first presented for feature selection. Three criteria were used to evaluate the quality of the selected features: the number of features, the false-positive rate, and the detection rate. The goal is to choose a high-quality feature subset to train the EIF and OneClass SVM models that perform intrusion detection. Finally, we tested the approach on the most recent 5G NIDD dataset, where the proposed model achieved 99% accuracy, precision, recall, and F1 score. Attackers are always developing new and complicated techniques to exploit weaknesses, making it difficult for IDS to keep up. These results show the study's contribution to providing a better IDS.
REFERENCES
[1] T. Shon and J. Moon, “A hybrid machine learning approach to network anomaly detection,” Inf. Sci., vol. 177, no. 18, pp. 3799-3821, Sept. 2007, doi:10.1016/j.ins.2007.03.025.
[2] IBM, “Security X-force threat intelligence index 2023”. Available at: https://www.ibm.com/downloads/cas/DB4GL8YM.
[3] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Commun. Surv. Tutor., vol. 18, no. 2, pp. 1153-1176, 2016, doi:10.1109/COMST.2015.2494502.
[4] A. Mukkamala et al., “Cyber security challenges: Designing efficient intrusion detection systems and antivirus tools” in Proc. of Enhancing Computer Security with Smart Technology, New York, NY, USA, 2005, pp. 125-163.
[5] A. Sundaram, “An introduction to intrusion detection,” Crossroads, vol. 2, no. 4, pp. 3-7, Apr. 1996.
[6] A. Nisioti et al., “From intrusion detection to attacker attribution: A comprehensive survey of unsupervised methods,” IEEE Commun. Surv. Tutor., vol. 20, no. 4, pp. 3369-3388, 2018, doi:10.1109/COMST.2018.2854724.
[7] P. Casas et al., “Unsupervised network intrusion detection systems: Detecting the unknown without knowledge,” Comput. Commun., vol. 35, no. 7, pp. 772-783, 2012, doi:10.1016/j.comcom.2012.01.016.
[8] I. Kang et al., “A differentiated oneclass classification method with applications to intrusion detection,” Expert Syst. Appl., vol. 39, no. 4, pp. 3899-3905, 2012, doi:10.1016/j.eswa.2011.06.033.
[9] F. Kuang et al., “A novel hybrid KPCA and SVM with GA model for intrusion detection,” Appl. Soft Comput., vol. 18, pp. 178-184, 2014, doi:10.1016/j.asoc.2014.01.028.
[10] L. Wang and J. Shen, “A systematic review of bio-inspired service concretization,” IEEE Trans. Serv. Comput., vol. 10, no. 4, pp. 493-505, 2017, doi:10.1109/TSC.2015.2501300.
[11] L. Wang et al., “Feed-back neural networks with discrete weights,” Neural Comput. Appl., vol. 22, no. 6, pp. 1063-1069, 2013, doi:10.1007/s00521-012-0867-8.
[12] H. Hindy et al., “Utilising deep learning techniques for effective zero-day attack detection,” Electronics, vol. 9, no. 10, p. 1684, 2020, doi:10.3390/electronics9101684.
[13] L. Wang and J. Shen, “Data-intensive service provision based on particle swarm optimization,” Int. J. Comp. Intell. Syst., vol. 11, no. 1, pp. 330-339, 2018, doi:10.2991/ijcis.11.1.25.
[14] M. Sadiq and A. Khan, “Rule-based network intrusion detection using genetic algorithms,” Int. J. Comput. Appl., vol. 18, no. 8, pp. 26-29, 2011, doi:10.5120/2303-2914.
[15] C. Wagner et al., “Machine learning approach for IP-flow record anomaly detection,” Lect. Notes Comput. Sci., vol. 6640, pp. 28-39, 2011, doi:10.1007/978-3-642-20757-0_3.
[16] I. Mbona and J. H. P. Eloff, “Detecting zero-day intrusion attacks using semi-supervised machine learning approaches,” IEEE Access, vol. 10, pp. 69822-69838, 2022, doi:10.1109/ACCESS.2022.3187116.
[17] N. Sameera and M. Shashi, “Deep transductive transfer learning framework for zero-day attack detection,” ICT Express, vol. 6, no. 4, pp. 361-367, 2020, doi:10.1016/j.icte.2020.03.003.
[18] H. Hindy et al., “Utilising deep learning techniques for effective zero-day attack detection,” Electronics, vol. 9, no. 10, pp. 1-16, 2020, doi:10.3390/electronics9101684.
[19] N. Altwaijry et al., “A Deep Learning Approach for Anomaly-Based Network Intrusion Detection,” Commun. Comput. Inf. Sci., vol. 1210 CCIS, pp. 603-615, 2020, doi:10.1007/978-981-15-7530-3_46.
[20] D. P. Gaikwad and R. C. Thool, “Online Anomaly Based Intrusion Detection System Using Machine Learning,” i-manager’s J. Cloud Comput., vol. 1, no. 1, pp. 19-25, 2014, doi:10.26634/jcc.1.1.2800.
[21] D. Oladimeji, “‘an Intrusion Detection System for Internet of,’ no,” Jun., pp. 1-25, 2021.
[22] L. Bilge and T. Dumitraş, “Before We Knew It: An Empirical Study of Zero-Day Attacks in the Real World,” Proc. ACM Conf. Comput. Commun., 2012, pp. 833-844, doi:10.1145/2382196.2382284.
[23] Thomas Wilson, Andrew Evans, Alejandro Perez, Luis Pérez, Juan Martinez. Machine Learning for Anomaly Detection and Outlier Analysis in Decision Science. Kuwait Journal of Machine Learning, 2(3). Retrieved from http://kuwaitjournals.com/index.php/kjml/article/view/207
[24] M. Sarhan et al., “From zero-shot machine learning to zero-day attack detection,” Int. J. Inf. Secur., vol. 2023, no. Jun., 2019, doi:10.1007/s10207-023-00676-0.
[25] H. Holm, “Signature based intrusion detection for zero-day attacks: (Not) A closed chapter?,” Proc. Annu. Hawaii Int. Conf. Syst. Sci., pp. 4895-4904, 2014, doi:10.1109/HICSS.2014.600.
[26] J. Wang et al., “ClusterRep: A cluster-based reputation framework for balancing privacy and trust in vehicular participatory sensing,” Int. J. Distrib. Sens. Netw., vol. 14, no. 9, 2018, doi:10.1177/1550147718803299.
[27] G. Nguyen et al., “Deep learning for proactive network monitoring and security protection,” IEEE Access, vol. 8, pp. 19696-19716, 2020, doi:10.1109/ACCESS.2020.2968718.
[28] G. De La Torre Parra et al., “Detecting Internet of Things attacks using distributed deep learning,” J. Netw. Comput. Appl., vol. 163, no. Oct., 2020, doi:10.1016/j.jnca.2020.102662.
[29] X. Li et al., “Detection of low-frequency and multi-stage attacks in industrial Internet of things,” IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 8820-8831, 2020, doi:10.1109/TVT.2020.2995133.
[30] S. Moraboena et al., “A deep learning approach to network intrusion detection using deep autoencoder,” Rev. Intell. Artif., vol. 34, no. 4, pp. 457-463, 2020, doi:10.18280/ria.340410.
[31] T. Wisanwanichthan and M. Thammawichai, “A double-layered hybrid approach for network intrusion detection system using combined naive Bayes and SVM,” IEEE Access, vol. 9, pp. 138432-138450, 2021, doi:10.1109/ACCESS.2021.3118573.
[32] I. Mbona and J. H. P. Eloff, “Detecting zero-day intrusion attacks using semi-supervised machine learning approaches,” IEEE Access, vol. 10, no. Apr., pp. 69822-69838, 2022, doi:10.1109/ACCESS.2022.3187116.
[33] M. A. Kabir and X. Luo, “Unsupervised learning for network flow based anomaly detection in the era of deep learning,” Proc 6th Int. Conf. Big Data Comput. Serv. Appl. Big Data Service 2020, vol. August 2020. IEEE, 2020, pp. 165-168, doi:10.1109/BigDataService49289.2020.00032.
[34] S. Samarakoon et al., 2022, 5G-NIDD: A comprehensive network intrusion detection dataset generated over 5G wireless network. arXiv preprint arXiv:2212.01298.
[35] K. Shaukat et al., “A survey on machine learning techniques for cyber security in the last decade,” IEEE Access, vol. 8, pp. 222310-222354, 2020, doi:10.1109/ACCESS.2020.3041951, P. 222.
[36] X.-S. Yang, “A new metaheuristic bat-inspired algorithm” in
Nature Inspired Cooperative Strategies for Optimization
(NICSO 2010). Berlin, Germany: Springer, 2010, pp. 65-74,
doi:10.1007/978-3-642-12538-6_6.
[37] R. Abdulhammed et al., “Features dimensionality reduction
approaches for machine learning based network intrusion
detection,” Electronics, vol. 8, no. 3, p. 322, Mar. 2019,
doi:10.3390/electronics8030322.
[38] E. Druică et al., “Benford’s law and the limits of digit
analysis,” Int. J. Acc. Inf. Syst., vol. 31, pp. 75-82, Dec. 2018,
doi:10.1016/j.accinf.2018.09.004.
[39] M. F. Umer et al., “Flow-based intrusion detection:
Techniques and challenges,” Comput. Secur., vol. 70, pp.
238-254, Sept. 2017, doi:10.1016/j.cose.2017.05.009.
[40] G. Biau et al., “Nevergrad – A gradient-free optimization
platform,” J. Mach. Learn. Res., vol. 21, no. 34, pp. 1-6, 2020.