2024 International Conference on Automation and Computation (AUTOCOM)

MLIDS: Revolutionizing of IoT based Digital Security Mechanism with Machine Learning Assisted Intrusion Detection System
2024 International Conference on Automation and Computation (AUTOCOM) | 979-8-3503-8272-3/24/$31.00 ©2024 IEEE | DOI: 10.1109/AUTOCOM60220.2024.10486179

Vijaya Vardan Reddy S P, Department of ECE, R.M.K. Engineering College, Chennai, Tamil Nadu, India. [email protected]

S. Priskilla Manonmani, Department of Information Technology, Meenakshi Sundararajan Engineering College, Chennai, Tamil Nadu, India. [email protected]

C. Anitha, Department of CSE, Saveetha School of Engineering, SIMATS, Chennai, Tamil Nadu, India. [email protected]

D. Jaganathan, Department of Artificial Intelligence, Madanapalle Institute of Technology & Science, Andhra Pradesh, India. [email protected]

R. Reena, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, Tamil Nadu, India. [email protected]

M. Suresh, Department of Mathematics, R.M.D. Engineering College, Chennai, Tamil Nadu, India. [email protected]

Abstract: Detecting intrusions on Internet of Things (IoT) devices is becoming more critical as the IoT grows exponentially and the number of connected devices explodes. To create intrusion detection systems that work in practice, researchers are utilizing machine learning methods. This study presents the Machine Learning Assisted Intrusion Detection System (MLIDS), a new intrusion detection system that efficiently identifies unusual network traffic. A cross-validation test against the traditional learning model XGBoost allows for a transparent evaluation of the suggested algorithm's performance. The preprocessed data is then classified using both the suggested MLIDS and the XGBoost methods. To get the best detection performance, the model's hyperparameters are tuned using optimization logic. The evolution of cybercrime has necessitated massive advancements in intrusion detection system (IDS) technology. Hackers nowadays deploy a wide variety of techniques to gain access to private data on computers, and a plethora of intrusion detection algorithms exist to protect against these threats. The exponential expansion and usage of the internet has raised growing worries over the secure communication and protection of digital information. The various assaults may be detected with the use of several intrusion detection algorithms, methods, and approaches. The overarching goal of this paper is to present a comprehensive analysis of intrusion detection systems, including but not limited to the different types of intrusion identification techniques, types of events, approaches and tools, future research requirements, and difficulties, and subsequently to develop an IDS for research purposes that can detect and prevent intrusions.

Keywords: MLIDS, Internet of Things, IoT, Digital Security, Machine Learning, Intrusion Detection System, IDS, XGBoost, Data Security, Server Protection

I. INTRODUCTION

The term "Internet of Things" refers to a system of interconnected computer networks that allows everyday objects to exchange data and instructions with one another via electronic means [1]. The ease with which these devices can be monitored and controlled from afar has spurred fast innovation in the development of numerous new applications across many fields, including smart home technology, wearable technology, health monitoring, energy management, connected industrial and manufacturing sensors and equipment, and many more. Handling device security and protecting data from threats is the main problem in IoT systems. Cyber assaults are defined as "the deliberate and malicious use of cyberspace to compromise the computer systems, networks, or personal information of another person or entity" [2]. Due to device and protocol heterogeneity, device resource restrictions, and direct internet exposure, protecting IoT devices against attacks is challenging.

Smart cities, smart houses, smart automobiles, and intelligent industrial systems are just a few of the applications that were expected to propel the Internet of Things to 50 billion connected devices by 2020. Malicious actors may take advantage of this expansion, which poses a significant threat to the availability, privacy, and integrity of data [3]. Data and privacy protection are also important aspects of cyber-security, which aims to prevent illegal access to systems and networks. Many new applications are being created that rely on linked devices; therefore, there has been a growing emphasis on IoT security in recent years. Smart homes, smart farms, healthcare, and many more have all been revolutionized by the fast expansion of the IoT. IoT devices play a vital role in people's daily lives. Nevertheless, these devices are vulnerable to a range of security threats due to their wide internet access [4], [11]. IoT devices, for instance, are vulnerable to a plethora of network threats since they share data via the internet. Threats to linked devices have become more pressing as the IoT has gained traction. Numerous threats, including denial of service, eavesdropping, and privilege escalation, can target IoT devices. Consequently, safeguarding IoT devices against these types of assaults is taking on more significance. The dispersed nature of IoT devices also makes them ideal targets for hackers. Additionally, the system

Authorized licensed use limited to: VIT University. Downloaded on October 26,2024 at 14:14:33 UTC from IEEE Xplore. Restrictions apply.
is vulnerable to cyber attacks such as web injection, which might cause the disclosure of sensitive information or data manipulation [5]. This is because the real-time communication of the many devices in the system relies on wireless networks, which are susceptible to eavesdropping. The Internet of Things needs more robust intrusion detection systems [12].

Various applications have made extensive use of IoT devices and networks in recent years. When it comes to protecting an organization's computer network, an intrusion detection system is a common tool. An intrusion detection system (IDS) is a viable and effective method for detecting assaults and guaranteeing network security by protecting against malicious hackers. An intrusion is said to occur when unanticipated events, on a local or global scale, compromise the availability, confidentiality, or integrity of a network. The packets that make up the network traffic include header fields that provide information about them. The goal of anomaly detection can be defined by features associated with such events. An intrusion detection system is designed to enhance confidentiality, integrity, and availability (CIA) by detecting and preventing both active and passive network intruder behaviors that are considered suspicious. The following steps need to be followed in order to construct the system:

(i) Construct a test bed to mimic an Internet of Things (IoT) environment;
(ii) Create adversarial systems to launch attacks;
(iii) Record network traffic and extract characteristics for both normal and attack situations; and
(iv) Create machine learning techniques to identify and categorize network assaults.

II. RELATED STUDY

Cyber security has emerged as a critical field of study due to the pervasive nature of networks in contemporary society [6]. An intrusion detection system, which tracks the health of network software and hardware, is a crucial tool for cyber defense. Despite decades of progress, current IDSs still struggle to detect new assaults, lower the false alarm rate, and improve detection accuracy. Many academics have concentrated on creating IDSs that leverage machine learning techniques to address these issues [6]. Machine learning algorithms can accurately and automatically distinguish between normal and abnormal data. Their high generalizability further increases their potential to identify previously unseen threats. Deep learning, a subfield of machine learning, has recently attracted a lot of attention from researchers due to its outstanding performance. To categorize and summarize the IDS literature based on machine learning and deep learning, the survey in [6] proposes a taxonomy that uses data items as the primary dimension, which its authors consider useful to cyber security researchers. The survey first defines the concept and classification of IDSs. It then presents the machine learning techniques, metrics, and benchmark datasets often employed in intrusion detection systems. Next, taking the proposed taxonomy and the representative literature as a starting point, it shows how ML and DL can be used to address important IDS problems. Finally, it discusses the difficulties and potential future advances by reviewing recent representative research [6].

Accurately identifying intrusions is getting more challenging due to the rising sophistication of cyber-attacks [7]. If the attacks are not prevented, the credibility of security services, including data availability, integrity, and confidentiality, might be compromised. Many different intrusion detection technologies exist in computer security. The two main categories are signature-based intrusion detection systems and anomaly-based intrusion detection systems (AIDS). The survey in [7] includes an exhaustive analysis of significant recent publications, a taxonomy of modern IDSs, and a review of the datasets frequently utilized for evaluation. It also covers the methods attackers employ to evade detection and the research challenges that lie ahead in countering these methods, with the ultimate goal of making computer systems more resistant to attacks [7].

Both the size of networks and the data associated with them have grown exponentially due to the fast development of the internet and other communication technologies. Consequently, network security faces difficulty in properly detecting breaches due to the proliferation of novel assaults [8]. In addition, the existence of intruders who intend to conduct a variety of assaults within the network cannot be disregarded. An intrusion detection system is one tool that checks network traffic for signs of intrusion and helps keep the network secure, private, and accessible at all times. Despite the huge efforts of researchers, increasing detection accuracy while minimizing false alarm rates and identifying novel intrusions remain open issues for IDSs [8]. One possible way to efficiently identify breaches throughout the network is to deploy intrusion detection systems based on machine learning (ML) and deep learning (DL). After defining IDSs, the study in [8] presents a taxonomy built around the most prominent ML and DL methods used to create network-based IDS (NIDS) systems. By analyzing the benefits and drawbacks of the proposed solutions, it offers a thorough overview of the most recent NIDS-based publications. It then gives up-to-date information on ML- and DL-based NIDS, including recent trends and developments in methodology, evaluation metrics, and dataset selection. Drawing on the shortcomings of the surveyed approaches, it highlights several research obstacles and recommends future research scope for developing ML- and DL-based NIDS [8].

Cyber security has emerged as a critical field of study due to the pervasive nature of networks in contemporary society. An integral part of cyber defense is the intrusion detection system (IDS), which keeps tabs on the health of all the network's software and hardware [9]. Despite decades of progress, current IDSs still struggle to detect new assaults, lower the false alarm rate, and improve detection accuracy. Many researchers have been working on intrusion detection systems that employ machine learning to address these issues. Machine learning techniques can accurately and automatically discover the key distinctions between normal and abnormal data [9]. The high generalizability of machine learning algorithms further increases their potential to identify previously unseen threats. Deep learning, a sub-field of machine learning, has recently attracted a lot of attention from researchers due to its outstanding performance. To categorize and summarize the IDS literature based on machine learning and deep learning, the survey in [9] proposes a taxonomy that uses data
items as the primary dimension. Its authors consider this categorization system useful to cyber security researchers. The survey first defines the concept and classification of IDSs. It then presents the machine learning techniques, metrics, and benchmark datasets often employed in intrusion detection systems. Next, taking the proposed taxonomy and the representative literature as a starting point, it shows how ML and DL can be used to address important IDS problems. Finally, it discusses the difficulties and potential future advances by reviewing recent representative research [9].

Even though intrusion detection systems (IDS) are effective at spotting suspicious network activity, they still have a poor detection rate and a high false alarm rate, particularly for anomalies that have few records [10]. The study in [10] presents DO_IDS, a hybrid data optimization strategy that combines data sampling and feature selection to create an efficient intrusion detection system. In data sampling, outliers are removed using Isolation Forest (iForest), the sampling ratio is optimized using a genetic algorithm (GA), and the best training dataset is obtained using a Random Forest (RF) classifier as the assessment criterion. Feature selection again uses GA and RF to obtain the best subset of features. The last step is to construct an RF-based intrusion detection system using the features chosen by feature selection and the best training set obtained through data sampling. The experiments use the UNSW-NB15 dataset. The model is reported to outperform competing algorithms, particularly in detecting rare anomalous behavior [10].

III. METHODOLOGY

When planning for network security, it is essential to consider whether the intrusion detection system for Internet of Things devices can function in real time. The processing power and storage capacity of most IoT devices are rather low. These resource limits may prevent the system from efficiently processing and analyzing massive amounts of network traffic data, causing system delays or failure to meet real-time detection requirements. Furthermore, intrusion detection systems face a growing strain in identifying assaults due to the prevalence of network attacks on IoT devices and the ongoing innovation in attack strategies. A large volume of network traffic data also introduces a lot of unnecessary and duplicated information. When the data contains redundant features, a model may over-fit, that is, learn to fit one particular data set, which reduces detection effectiveness. To build a thorough IoT security model that improves the accuracy of detecting security threats compared to the existing XGBoost algorithm, this research follows an organized approach that utilizes a Machine Learning Assisted Intrusion Detection System (MLIDS) model. Fig. 1 describes the architectural framework of the research.

Fig. 1. System Architecture

The suggested approach is introduced in this section. Cleaning and normalizing the dataset is the first stage. The MLIDS model then ranks the significance of each feature, and features with poor scores are identified and removed. Repeating this process iteratively yields a suitable subset of features. Once data preparation is complete, the dataset is divided into a training set and a testing set, as is common practice.

(i) Data Pre-Processing: The raw dataset is first processed to make it acceptable for the MLIDS algorithm. This involves three sub-steps: standardization, normalization, and data cleaning. In the first sub-step, the dataset is standardized so that all features are on the same scale. In the second sub-step, the data is normalized: the values are transformed so that every value in the dataset lies in the range 0 to 1. This is a crucial step to keep neural networks from rejecting negative values. In the third sub-step, data cleaning, irrelevant entries such as NaN and null values are eliminated.

(ii) Feature Selection: At this stage, the best features for the model are chosen. This phase is significant in MLIDS because it affects the model's performance: the accuracy of a model suffers if the features used are not suitable. The features "dur", "rate", "srate", and "drate", which express the time and length properties that influence the classification of assaults, were used.

(iii) Classification: In this stage, several models are used to predict the assault.

(iv) Training, Testing and Evaluation: Using the selected features, we trained the models. We trained our model using 80% of the data and tested it using the remaining 20%. This split proved sufficient for model training and testing, and it yielded accurate assault predictions.

Fig. 2 shows the general process used for attack design in this investigation.
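The cleaning, normalization, and splitting steps described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`clean_rows`, `min_max_scale`, `train_test_split`), the toy records, and the fixed random seed are hypothetical. Only the feature names "dur", "rate", "srate", and "drate", the 0-to-1 scaling, and the 80/20 split come from the text; the feature-ranking step is omitted because the paper does not specify how scores are computed.

```python
import math
import random

FEATURES = ["dur", "rate", "srate", "drate"]  # feature names taken from the text


def clean_rows(rows):
    """Data cleaning: drop records with a missing (None) or NaN feature value."""
    cleaned = []
    for row in rows:
        values = [row.get(name) for name in FEATURES]
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
            continue
        cleaned.append(row)
    return cleaned


def min_max_scale(rows):
    """Normalization: rescale every feature column to the range [0, 1]."""
    if not rows:
        return []
    scaled = [dict(row) for row in rows]
    for name in FEATURES:
        column = [row[name] for row in rows]
        low, high = min(column), max(column)
        span = high - low
        for row in scaled:
            # A constant column carries no information; map it to 0.0.
            row[name] = (row[name] - low) / span if span else 0.0
    return scaled


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the records, then hold out `test_fraction` of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy records (hypothetical values); label 1 marks attack traffic, 0 normal traffic.
raw = [
    {"dur": float(i), "rate": 10.0 * i, "srate": 4.0 * i, "drate": 6.0 * i, "label": i % 2}
    for i in range(1, 11)
]
raw.append({"dur": float("nan"), "rate": 5.0, "srate": 2.0, "drate": 3.0, "label": 0})
raw.append({"dur": 2.0, "rate": None, "srate": 1.0, "drate": 1.0, "label": 1})

cleaned = clean_rows(raw)            # the NaN and None records are removed
scaled = min_max_scale(cleaned)      # every feature now lies in [0, 1]
train, test = train_test_split(scaled)  # 80% for training, 20% for testing
```

In this sketch the scaling statistics are computed on the full cleaned set for brevity; a stricter variant would compute the minimum and maximum on the training portion only and reuse them for the test portion.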

Fig. 2. Attack Design

IV. RESULTS AND DISCUSSIONS

An anomaly-based intrusion detection system monitors activity and labels it as normal or abnormal. Using this method, it can identify various kinds of network intrusion and misuse. The classification aims to identify any form of harmful behavior that falls outside regular system operation, and it is based on a set of rules instead of patterns or signatures. Signature-based systems, in contrast, are limited to detecting attacks for which they already have a signature. The advantages of anomaly-based detection include the potential to identify new types of intrusions, the capacity to spot abnormalities without knowing their origins or traits, less dependence on the operating system than attack-signature-based systems, and the capability to identify instances of user privilege abuse. Byte patterns in network data and known malicious instruction sequences used by malware are examples of the patterns that signature-based intrusion detection systems search for. The term "signature" for these patterns comes from antivirus software. Unfortunately, new assaults cannot be detected by signature-based intrusion detection systems, since no pattern is yet available for them. In this method, the signature that identifies the intruder already exists. Automatically generating misuse detection rules produces more complex and accurate results than crafting them by hand. Depending on the severity and robustness of a signature that is triggered inside the system, an alarm response or notice should be sent to the appropriate authorities.

The data gathering module sends data to the IDS. The information is saved in a file and then examined. Network-based systems gather and alter data packets, whereas host-based systems gather information such as disk utilization and system processes. To analyze them for intrusion, the feature selection module draws on the network's large data sets. Examples of features that can serve as a key for intrusion selection include the Internet Protocol (IP) addresses of the source and destination systems, the protocol type, the header length, and the size. The data analysis module then checks the data for accuracy. Rule-based intrusion detection systems analyze data by comparing incoming traffic to established patterns or signatures. Anomaly-based intrusion detection systems are another option; this approach uses mathematical models to analyze system behavior. The detection module specifies the attack and the system's response to it. In addition to alerting the system administrator by email or alarm icons, it may actively intervene by closing ports or dropping packets to prevent entry.

The accuracy of the suggested Machine Learning Assisted Intrusion Detection System (MLIDS) is shown in Fig. 3. To assess the performance of both models, it is cross-validated against the traditional XGBoost approach. The same results are listed in Table I.

TABLE I. ACCURACY ANALYSIS

S.No.  Epochs  XGBoost (%)  MLIDS (%)
1.     50      91.67        97.84
2.     75      91.72        97.31
3.     100     91.54        97.86
4.     125     91.62        97.52
5.     150     91.56        97.53
6.     175     91.52        97.49
7.     200     91.49        97.45
8.     225     91.46        97.41

Fig. 3. Analyzing Precision

TABLE II. PRECISION ANALYSIS

S.No.  Epochs  XGBoost (%)  MLIDS (%)
1.     50      90.86        94.26
2.     75      89.54        94.39
3.     100     89.32        94.19
4.     125     89.67        94.61
5.     150     90.35        94.58
6.     175     90.14        94.66
7.     200     89.90        94.75
8.     225     89.87        94.83
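Tables I and II report accuracy and precision as percentages. As a reminder of how such figures are typically derived, the sketch below computes accuracy, precision, recall, and F1-score from binary predictions using the standard confusion-matrix definitions. The function name `binary_metrics` and the toy label vectors are hypothetical, and the paper does not state which class or averaging convention it used, so this is an illustration under those assumptions rather than the authors' evaluation code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 (as percentages) for one class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts, treating `positive` as the attack class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(numerator, denominator):
        # Report 0.0 instead of dividing by zero when a count is empty.
        return 100.0 * numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels (hypothetical): 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

With these toy vectors there are three true positives, one false positive, one false negative, and five true negatives, so accuracy is 80% while precision, recall, and F1 are each 75%.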

To compare the two models' performance, the suggested MLIDS is evaluated against the more traditional XGBoost approach; the resulting precision is shown in Fig. 4. The same results are listed in Table II.

Fig. 4. PRECISION ANALYSIS

Fig. 5 and Fig. 6 show the F1-score and recall of the suggested MLIDS compared to the standard XGBoost method, which was used as the baseline for cross-validation. The same results are listed in Table III and Table IV.

TABLE III. F1-SCORE

S.No.  Epochs  XGBoost (%)  MLIDS (%)
1.     50      89.67        95.84
2.     75      89.79        95.67
3.     100     90.34        95.92
4.     125     90.71        95.89
5.     150     91.05        95.93
6.     175     91.41        95.97
7.     200     91.78        96.01
8.     225     92.15        96.05

Fig. 5. F1-Score

TABLE IV. RECALL

S.No.  Epochs  XGBoost (%)  MLIDS (%)
1.     50      89.27        96.27
2.     75      89.36        96.14
3.     100     89.52        96.31
4.     125     87.64        96.47
5.     150     88.32        96.49
6.     175     89.17        96.57
7.     200     88.33        96.64
8.     225     88.17        96.72

Fig. 6. Recall

V. CONCLUSION

The primary goal of this article is to introduce intrusion detection systems, including their uses and why they are necessary. The article focuses on identifying the various types of intrusion detection systems (IDS) and on detection in an Internet of Things (IoT) context. Today, IDSs are crucial for the security of both businesses and their network users. The suggested model, MLIDS, specifies preventative security actions. The life cycle provides a visual representation of the stages and how they evolved. More obstacles remain to be surmounted. Anomaly detection and misuse detection strategies are demonstrated in particular, and additional approaches can be utilized. Two areas will be researched further: improving classification-based IDS with selective feedback techniques, and comparing several prominent data mining algorithms applied to IDS.

REFERENCES

[1] Anish Halimaa A. and K. Sundarakantham, "Machine Learning Based Intrusion Detection System", 3rd International Conference on Trends in Electronics and Informatics, DOI: 10.1109/ICOEI.2019.8862784, 2019.
[2] Lirim Ashiku and Cihan Dagli, "Network Intrusion Detection System using Deep Learning", Procedia Computer Science, https://doi.org/10.1016/j.procs.2021.05.025, 2021.
[3] T.J. Nagalakshmi, et al., "Machine learning models to detect the blackhole attack in wireless adhoc network", Materials Today: Proceedings, Volume 47, Part 1, 2021, Pages 235-239, ISSN 2214-7853, https://doi.org/10.1016/j.matpr.2021.04.129.
[4] Emad E. Abdallah, Wafa' Eleisah, et al., "Intrusion Detection Systems using Supervised Machine Learning Techniques: A survey", Procedia Computer Science, https://doi.org/10.1016/j.procs.2022.03.029, 2022.
[5] A. G, et al., "An Intelligent LoRa based Women Protection and Safety Enhancement using Internet of Things," (I-SMAC), Dharan, Nepal, 2022, pp. 43-48, doi: 10.1109/I-SMAC55078.2022.9987425.
[6] Hongyu Liu and Bo Lang, "Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey", Appl. Sci., https://doi.org/10.3390/app9204396, 2019.
[7] Ansam Khraisat, Iqbal Gondal, et al., "Survey of intrusion detection systems: techniques, datasets and challenges", Cybersecur, https://doi.org/10.1186/s42400-019-0038-7, 2019.
[8] Zeeshan Ahmad, Adnan Shahid Khan, et al., "Network intrusion detection system: A systematic study of machine learning and deep learning approaches", Emerging Telecommunications Technologies, https://doi.org/10.1002/ett.4150, 2020.
[9] Hongyu Liu and Bo Lang, "Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey", Applied Sciences, DOI: 10.3390/app9204396, 2019.
[10] Jiadong Ren, Jiawei Guo, et al., "Building an Effective Intrusion Detection System by Using Hybrid Data Optimization Based on Machine Learning Algorithms", Security and Communication Networks, https://doi.org/10.1155/2019/7130868, 2019.
[11] Abhijit D. Jadhav and Vidyullatha Pellakuri, "Intrusion Detection System Using Machine Learning Techniques for Increasing Accuracy and Distributed & Parallel Approach for Increasing Efficiency", 5th International Conference On Computing, Communication, Control And Automation, DOI: 10.1109/ICCUBEA47591.2019.9128620, 2019.
[12] Musaab Riyadh and Dina Riadh Alshibani, "Intrusion detection system based on machine learning techniques", Indonesian Journal of Electrical Engineering and Computer Science, DOI: 10.11591/ijeecs.v23.i2.pp953-961, 2021.
