
INTRUSION DETECTION SYSTEM

A PROJECT REPORT

Submitted by

JEFFERSON JAMES C (950520104014)

MOOSA MULAFFAR MS (950520104026)

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE AND ENGINEERING

Dr. SIVANTHI ADITANAR COLLEGE OF ENGINEERING,


TIRUCHENDUR-628 215

ANNA UNIVERSITY: CHENNAI 600 025

MAY 2024
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report "INTRUSION DETECTION SYSTEM" is


the bonafide work of "JEFFERSON JAMES C (950520104014), MOOSA
MULAFFAR MS (950520104026)" who carried out the Project work under my
supervision.

SIGNATURE
Dr. G. Wiselin Jiji, M.E., Ph.D.,
PRINCIPAL &
HEAD OF DEPARTMENT,
Department of Computer Science and Engineering,
Dr. Sivanthi Aditanar College of Engineering,
Tiruchendur-628215.

SIGNATURE
Dr. D. Kesavaraja, M.E., Ph.D.,
SUPERVISOR,
ASSOCIATE PROFESSOR,
Department of Computer Science and Engineering,
Dr. Sivanthi Aditanar College of Engineering,
Tiruchendur-628215.

Submitted to the B.E. project viva-voce examination held on…………………

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

First and foremost, we would like to thank God Almighty, who by His
abundant grace sustained us to complete the project successfully.

Our sincere thanks to our honorable founder Padmashri Dr. B. Sivanthi
Adithan and our beloved chairman Sri. S. Balasubramanian Adityan for
providing us with an excellent infrastructure and a conducive atmosphere for
developing our project.

We also thank our respected Principal and Head of the Department of Computer
Science and Engineering, Dr. G. Wiselin Jiji, M.E., Ph.D., for giving us the
opportunity to display our professional skills through this project.

Our heartfelt thanks to our project coordinator, Mrs. S. V. Anandhi, M.E.,
(Ph.D.), Assistant Professor, for the support and advice she has given us
throughout our project reviews.

We are greatly thankful to our guide, Dr. D. Kesavaraja, M.E., Ph.D.,
Associate Professor, Department of Computer Science and Engineering, for the
valuable guidance and motivation, which helped us to complete this project on
time.

We thank all our teaching and non-teaching staff members of the Computer
Science department for their passionate support, for helping us to identify our
flaws and also for the appreciation they gave us in achieving our goal. Also, we
would like to record our deepest gratitude to our parents for their constant
encouragement.

ABSTRACT

An intrusion detection system (IDS) is a critical component of modern


network security. It is designed to monitor network traffic and identify potential
security threats, including unauthorized access, misuse, or other malicious
activities. IDSs work by analysing network traffic and comparing it against a
database of known attack signatures or behaviour patterns. They can be
deployed at various points in a network and can generate alerts or take
automated actions to respond to threats. There are two main types of IDSs:
signature-based and behaviour-based. Signature-based IDSs use a database of
known attack patterns to identify threats, while behaviour-based IDSs use
machine learning and other techniques to analyse network traffic and detect
anomalies. IDSs can be deployed as standalone appliances or integrated into
existing security architectures to provide real-time threat intelligence and
response capabilities. IDSs are essential tools for protecting networks from a
wide range of threats, including malware infections, network breaches, and
insider attacks. However, they can also generate false positives, which can be
time-consuming to investigate, and they require ongoing maintenance and
updates to remain effective against evolving threats. Regular testing and
evaluation of the IDS is important to ensure it is providing adequate protection
against emerging threats. In addition to generating alerts, IDSs can also take
automated actions to respond to threats. However, these automated responses
should be carefully configured and tested to avoid disrupting legitimate traffic
or causing other unintended consequences.

TABLE OF CONTENTS

CHAPTER NO   TITLE   PAGE NO
ABSTRACT iv
LIST OF FIGURES vii
LIST OF ABBREVIATIONS viii
1 INTRODUCTION 1
1.1 FEATURE SELECTION 1
1.2 FEATURE ENGINEERING 2
1.3 CLASSIFICATION 3
1.4 MACHINE LEARNING 4
1.5 ENSEMBLE LEARNING 4
1.6 ANOMALY DETECTION 5
1.7 INTRUSION DETECTION SYSTEM 5
1.8 OBJECTIVES 6
2 LITERATURE SURVEY 7
3 SYSTEM ANALYSIS 16
3.1 EXISTING SYSTEM 16
3.1.1 DRAWBACKS 16
3.2 PROPOSED SYSTEM 17
3.2.1 ADVANTAGES 18
3.3 FEASIBILITY STUDY 18
3.3.1 TECHNICAL FEASIBILITY 18
3.3.2 OPERATIONAL FEASIBILITY 19
3.3.3 ECONOMICAL FEASIBILITY 20
4 SYSTEM SPECIFICATION 21
4.1 HARDWARE CONFIGURATION 21
4.2 SOFTWARE SPECIFICATION 21

5 SOFTWARE DESCRIPTION 22
5.1 FRONT END 22
6 PROJECT DESCRIPTION 29
6.1 PROBLEM DEFINITION 29
6.2 MODULE DESCRIPTION 29
6.3 SYSTEM FLOW DIAGRAM 31
6.4 INPUT DESIGN 32
6.5 OUTPUT DESIGN 32
7 SYSTEM TESTING AND 33
IMPLEMENTATION
7.1 SYSTEM TESTING 33
7.2 SYSTEM IMPLEMENTATION 33
8 SYSTEM MAINTENANCE 34
8.1 CORRECTIVE MAINTENANCE 35
8.2 ADAPTIVE MAINTENANCE 35
8.3 PERFECTIVE MAINTENANCE 36
9 CONCLUSION AND FUTURE 37
ENHANCEMENT
10 APPENDICES 38
10.1 SOURCE CODE 38
10.2 SCREEN SHOTS 54
11 REFERENCES 56

LIST OF FIGURES

FIGURE NO FIGURE NAME PAGE NUMBER

1 FEATURE SELECTION 2

2 FEATURE ENGINEERING 3

3 SYSTEM FLOW DIAGRAM 31

LIST OF ABBREVIATIONS

NIDS     NETWORK INTRUSION DETECTION SYSTEM
DSL      DIGITAL SUBSCRIBER LINE
TDM      TIME DIVISION MULTIPLEXING
EPON     ETHERNET PASSIVE OPTICAL NETWORK
NG-PON2  NEXT-GENERATION PASSIVE OPTICAL NETWORK STAGE 2
WDM      WAVELENGTH DIVISION MULTIPLEXING
WLAN     WIRELESS LOCAL AREA NETWORK
IOT      INTERNET OF THINGS
D-FES    DEEP-FEATURE EXTRACTION AND SELECTION
AWID     AEGEAN WI-FI INTRUSION DATASET
B2B      BUSINESS-TO-BUSINESS
GB       GRADIENT BOOSTING
RF       RANDOM FOREST
CNN      CONVOLUTIONAL NEURAL NETWORK

CHAPTER 1

1. INTRODUCTION

In the digital age, the security of computer networks and data has become
paramount. With the increasing sophistication of cyber threats and the
interconnectedness of our systems, the need for robust network intrusion
detection systems (NIDS) has never been greater. Intrusion detection plays a
pivotal role in safeguarding organizations, detecting unauthorized access, and
mitigating potential threats to information systems. Traditional intrusion
detection methods often face challenges in adapting to the ever-evolving threat
landscape. To address these challenges and enhance the efficacy of intrusion
detection, we propose a novel approach, "Network Intrusion Detection with
Two-Phased Hybrid Ensemble Learning and Automatic Feature Selection."
This research embarks on a journey to amalgamate cutting-edge techniques
from the realms of machine learning, data science, and cybersecurity. By fusing
the power of ensemble learning and automatic feature selection into a two-
phased detection system, we aim to redefine the landscape of network intrusion
detection.

1.1 FEATURE SELECTION

In the ever-expanding digital ecosystem, the security of networks and


information systems has become a paramount concern. The proliferation of
cyber threats, from sophisticated malware to advanced persistent threats,
necessitates the constant evolution of network intrusion detection systems
(NIDS) to safeguard against unauthorized access and malicious activities. At the
heart of effective NIDS lies the selection of the most pertinent data attributes,
commonly referred to as "features." Feature selection is a critical process within
the field of machine learning and data analysis, with the primary goal of
identifying and retaining the most informative attributes while discarding
irrelevant or redundant ones. In the context of network intrusion detection, the
judicious selection of features is pivotal in enhancing both the efficiency and
accuracy of the detection process. The motivation behind feature selection in
network intrusion detection is rooted in the quest for improved detection
capabilities and resource optimization. Conventional NIDS often confront
challenges posed by high-dimensional data and noisy features. These challenges
can lead to increased computational demands, suboptimal detection rates, and
heightened susceptibility to false alarms. As such, there exists a compelling
need to streamline the feature space, preserving only those attributes that
significantly contribute to the detection of intrusions, while discarding those
that introduce noise or computational overhead.

Figure 1 FEATURE SELECTION
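The filter-style selection described above can be made concrete with a short, self-contained sketch. The tiny dataset and the feature names (`duration`, `src_bytes`, `failed_logins`) are hypothetical, chosen only for illustration and not taken from the project's actual pipeline: each feature is scored by its absolute Pearson correlation with the intrusion label, and only the top-k are retained.

```python
# Minimal filter-style feature selection: rank features by absolute
# correlation with the intrusion label and keep the top k.
# The feature names and tiny dataset are hypothetical, for illustration only.
import math

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length numeric sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_top_k(rows, labels, names, k):
    # Score each feature column against the label and keep the k best.
    scores = {}
    for j, name in enumerate(names):
        col = [row[j] for row in rows]
        scores[name] = abs(pearson(col, labels))
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical flow records: [duration, src_bytes, failed_logins]
rows = [[1, 100, 0], [2, 120, 0], [1, 90, 1], [50, 8000, 4], [60, 9000, 5]]
labels = [0, 0, 0, 1, 1]  # 1 = intrusion
print(select_top_k(rows, labels, ["duration", "src_bytes", "failed_logins"], 2))
# → ['src_bytes', 'duration']
```

A real NIDS pipeline would apply the same ranking idea over dozens of attributes with a more robust score (e.g., mutual information), but the keep-the-informative, drop-the-redundant principle is the same.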

1.2 FEATURE ENGINEERING

In the realm of machine learning, the quality of data is often paramount to the
success of predictive models and data-driven applications. While machine
learning algorithms can work wonders when presented with vast datasets, the art
of "feature engineering" has emerged as an indispensable process to transform
raw data into a more informative and efficient format. Feature engineering is a
craft, akin to sculpting a raw material into a masterpiece, where the raw material

comprises the data and the masterpiece is an accurate and powerful predictive
model. The motivation behind feature engineering lies in the inherent
limitations and idiosyncrasies of raw data. In many real-world applications, data
is messy, incomplete, and often contains extraneous information. Furthermore,
not all data attributes are equally relevant to the task at hand. Feature
engineering seeks to address these challenges by meticulously crafting new
features or transforming existing ones to better capture the underlying patterns
and relationships in the data.

Figure 2 FEATURE ENGINEERING
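As a small illustration of this crafting process, the sketch below derives three new attributes from one raw connection record. The record layout and the derived features are invented for the example and are not the project's actual schema:

```python
# Toy sketch of feature engineering on a raw connection record: derive a
# bytes ratio, a log-scaled volume, and a binary "off-hours" flag.
# The record layout here is hypothetical, not the project's actual schema.
import math

def engineer(record):
    src, dst, hour = record["src_bytes"], record["dst_bytes"], record["hour"]
    return {
        "byte_ratio": src / (dst + 1),        # +1 avoids division by zero
        "log_volume": math.log1p(src + dst),  # compress heavy-tailed volumes
        "off_hours": 1 if hour < 6 or hour > 22 else 0,
    }

raw = {"src_bytes": 4800, "dst_bytes": 120, "hour": 3}
feats = engineer(raw)
print(feats["off_hours"])  # → 1
```

Each derived attribute encodes domain knowledge (asymmetric transfers, skewed traffic volumes, unusual activity times) that a learner could not easily recover from the raw columns alone.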

1.3 CLASSIFICATION

In the vast landscape of machine learning, the task of classification stands as


one of the most fundamental and ubiquitous endeavors. At its core,
classification is about imparting the ability to an algorithm to make sense of the
world by assigning items into predefined categories or classes based on their
inherent characteristics. This ability is not only pervasive but also profoundly
influential, as it underpins a multitude of real-world applications, ranging from
spam email filtering to medical diagnosis and beyond. The motivation behind
classification is deeply rooted in our innate human inclination to categorize and
organize the information-rich environment around us. In a digital context,
classification serves as a potent tool for automating decision-making processes,
discerning patterns in data, and making predictions. It is the keystone of

supervised learning, where models are trained on labeled data to replicate the
human ability to classify objects or observations into meaningful groups.
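A deliberately minimal classifier makes the idea concrete. The nearest-centroid rule below is an illustrative stand-in only (not the algorithm this project proposes), and the two-dimensional data is synthetic: each class is summarized by the mean of its training points, and a new point takes the label of the closest mean.

```python
# A minimal nearest-centroid classifier on synthetic 2-D data.
# Illustrative only; real NIDS classifiers use far richer features.

def centroids(points, labels):
    # Accumulate per-class sums and counts, then divide to get class means.
    sums = {}
    for p, y in zip(points, labels):
        acc, n = sums.setdefault(y, ([0.0] * len(p), 0))
        sums[y] = ([a + v for a, v in zip(acc, p)], n + 1)
    return {y: [a / n for a in acc] for y, (acc, n) in sums.items()}

def classify(point, cents):
    # Assign the label whose centroid is nearest in squared distance.
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(point, c))
    return min(cents, key=lambda y: dist2(cents[y]))

X = [[0.1, 0.2], [0.2, 0.1], [5.0, 5.1], [5.2, 4.9]]
y = ["normal", "attack", "attack", "attack"][:0] or ["normal", "normal", "attack", "attack"]
print(classify([4.8, 5.0], centroids(X, y)))  # → attack
```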

1.4 MACHINE LEARNING

In the digital age, we find ourselves surrounded by an unprecedented deluge of


data. From the clicks we make on the internet to the sensors in our smartphone
and the vast databases that underpin modern businesses, data has become the
lifeblood of the information age. Yet, amidst this data-driven revolution, the
ability to transform raw data into actionable knowledge is what sets the stage
for the most remarkable technological advancements we've ever witnessed. At
the heart of this transformative process stands the science of "Machine
Learning." The motivation behind machine learning is rooted in our desire to
make sense of the vast and complex data ecosystems that define our
contemporary world. It's driven by the recognition that traditional rule-based
programming is often insufficient to address the intricacies of modern problems.
Instead, machine learning endows computers with the capacity to learn from
data, to recognize patterns, and to make decisions, often with astonishing
accuracy. Machine learning algorithms can be trained to predict future events or
outcomes, from stock prices and weather patterns to disease diagnoses.

1.5 ENSEMBLE LEARNING

In the realm of machine learning, the quest for improved predictive accuracy
and robustness has led to the development of ingenious techniques, and at the
forefront of this innovation stands the concept of "Ensemble Learning." Much
like the collective intelligence of a diverse group of individuals can often
outperform a single expert, ensemble learning harnesses the power of multiple
machine learning models to make better predictions, decisions, and
classifications. The motivation behind ensemble learning stems from the
acknowledgment that no single machine learning model is universally optimal

for all tasks and datasets. In practice, different algorithms excel under different
conditions, and they may be more adept at capturing specific patterns or
mitigating particular sources of error. Ensemble learning seeks to capitalize on
this diversity by combining the strengths of multiple models, mitigating their
individual weaknesses, and achieving superior performance as a collective. By
aggregating predictions from multiple models, ensemble learning aims to
improve overall predictive accuracy. This is particularly valuable in domains
where high accuracy is paramount, such as medical diagnosis or financial
forecasting.
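A majority-vote combiner is the simplest instance of this idea. In the sketch below the three base classifiers are stubbed prediction lists rather than trained learners such as random forest or gradient boosting; only the aggregation step is shown:

```python
# Majority-vote ensemble: each of three (stubbed) base classifiers votes on
# every sample, and the most common label wins.
from collections import Counter

def majority_vote(predictions_per_model):
    # predictions_per_model: list of per-model label lists, aligned by sample.
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted

model_a = ["normal", "attack", "attack"]
model_b = ["normal", "normal", "attack"]
model_c = ["attack", "attack", "attack"]
print(majority_vote([model_a, model_b, model_c]))
# → ['normal', 'attack', 'attack']
```

Notice how the second sample is rescued: two of the three models call it an attack, so a single model's miss does not decide the outcome, which is exactly the error-mitigation property described above.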

1.6 ANOMALY DETECTION

In an interconnected world brimming with data, the ability to identify the


extraordinary within the ordinary has emerged as a critical pursuit across
diverse domains. Anomaly Detection, often referred to as outlier detection,
stands as a sentinel in this quest for insight and security. It is the art and science
of distinguishing exceptional occurrences, behaviours, or patterns that deviate
significantly from the expected norm. The ubiquity of data in contemporary
society has given rise to an unprecedented opportunity: the capacity to glean
hidden knowledge from the vast tapestry of information. Anomaly detection
plays an integral role in this endeavour by spotlighting the unusual, the
unexpected, and the potentially impactful amid the constant flow of data.
Whether applied to fault detection in industrial systems, fraud prevention in
financial transactions, or intrusion detection in cybersecurity, the fundamental
goal remains the same: to unearth anomalies that could signify opportunities,
threats, or areas warranting further investigation.
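A z-score detector is one of the simplest concrete examples of this principle: flag any observation that lies unusually many standard deviations from the sample mean. This is a baseline sketch on made-up traffic samples, not the detector used in this project:

```python
# Z-score anomaly detection: flag values more than `threshold` standard
# deviations from the mean of the sample. Baseline sketch on made-up data.
import math

def zscore_anomalies(values, threshold=2.5):
    # A modest threshold: with only n samples, the maximum attainable
    # z-score is bounded by (n - 1) / sqrt(n), so 3.0 can be unreachable.
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) / std > threshold]

traffic = [100, 102, 98, 101, 99, 100, 97, 103, 100, 1000]  # bytes/sec samples
print(zscore_anomalies(traffic))  # → [1000]
```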

1.7 INTRUSION DETECTION SYSTEM

In today's digitally interconnected world, the protection of sensitive data and


critical infrastructure from cyber threats is of paramount importance. As the

complexity and sophistication of malicious activities continue to evolve,
traditional rule-based Intrusion Detection Systems (IDS) have faced limitations
in effectively identifying and mitigating these threats. In response to this ever-
expanding threat landscape, the integration of machine learning techniques
within IDS has emerged as a promising approach. Machine learning, a subset of
artificial intelligence, has the unique capability to adapt and learn from data,
making it well-suited for the dynamic and evolving nature of cyber threats. By
leveraging advanced algorithms and data-driven insights, machine learning-
based IDS aim to bolster cybersecurity defences by detecting anomalous
patterns and malicious behaviours in network traffic, system logs, and other
digital assets.
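The rule-based signature matching that machine-learning IDSs aim to improve upon can be sketched in a few lines: scan event strings against a table of known-bad patterns and emit an alert on every match. The signatures and log lines below are fabricated for illustration:

```python
# Sketch of signature-based intrusion detection: match events against a
# small table of known-bad regular expressions. Signatures are made up.
import re

SIGNATURES = {
    "sql_injection": re.compile(r"('|%27)\s*or\s*1=1", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./\.\./"),
}

def scan(events):
    # Return (signature name, offending event) for every match found.
    alerts = []
    for event in events:
        for name, pattern in SIGNATURES.items():
            if pattern.search(event):
                alerts.append((name, event))
    return alerts

events = [
    "GET /index.html",
    "GET /login?user=admin' OR 1=1--",
    "GET /../../etc/passwd",
]
for name, event in scan(events):
    print(name)
```

The limitation motivating this chapter is visible in the structure itself: the dictionary of patterns can only ever catch what has already been written down, whereas a learning-based detector generalizes beyond the table.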

1.8 OBJECTIVES

1. Develop and implement an Intrusion Detection System (IDS) using the


ADT-SVM algorithm for dynamic cybersecurity threat detection.

2. Explore temporal and thermal correlations in network data to enhance the


adaptability of the IDS.

3. Evaluate IDS performance using key metrics such as Detection Rate (DR)
and False Alarm Rate (FAR) on the KDD dataset.
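The two evaluation metrics named in objective 3 reduce to simple ratios over the confusion counts: DR = TP / (TP + FN) and FAR = FP / (FP + TN). The sketch below shows only these formulas (not the ADT-SVM detector itself), and the counts are hypothetical:

```python
# Detection Rate (DR) and False Alarm Rate (FAR) from confusion counts.
# DR = TP / (TP + FN); FAR = FP / (FP + TN). Counts below are hypothetical.

def detection_rate(tp, fn):
    return tp / (tp + fn)

def false_alarm_rate(fp, tn):
    return fp / (fp + tn)

# e.g. 95 attacks caught, 5 missed; 20 false alarms among 1000 normal flows
dr = detection_rate(tp=95, fn=5)
far = false_alarm_rate(fp=20, tn=980)
print(f"DR={dr:.2%}, FAR={far:.2%}")  # → DR=95.00%, FAR=2.00%
```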

CHAPTER 2

2. LITERATURE SURVEY

2.1 THE EVOLUTION OF ETHERNET PASSIVE OPTICAL


NETWORK (EPON) AND FUTURE TRENDS

Felix Obite et al. have proposed in this paper that the tremendous Internet traffic
growth has confirmed that the telecommunications backbone is moving
aggressively from a time division multiplexing (TDM) orientation to a focus on
Ethernet solutions. Ethernet PON, which presents the convergence of low-cost
Ethernet and fiber infrastructures, has taken over the market initially dominated
by Digital Subscriber Line (DSL) and cable modems. It is a new technology that
is simple, inexpensive, and scalable, having the ability to deliver massive data
services to end-users over a single network. This paper reviewed the evolution
of Ethernet Passive Optical Network (EPON), with focus on the current
development process of the future high-data-rate access networks such as Next-
Generation Passive Optical Network Stage 2 (NG-PON2), Wavelength Division
Multiplexing (WDM) PON, and Orthogonal Frequency Division Multiplexing
(OFDM) PON. In addition, the recently concluded 100 Gb Ethernet Passive
Optical Network (100G-EPON) is reviewed with the aim of highlighting the
recent developments in the field. With this comprehensive and up-to-date
review, we equip network operators and interested practitioners to focus on
common priorities and timelines. Another goal of this study is to identify
technical remedies for future investigation. Data traffic is on the increase at an
alarming rate and more users are accessing online, those who are already online
spend more time online and use more bandwidth-intensive applications.
Broadband services permitting high-speed Internet transmission are expected to
improve economies. Hence, large bandwidth and mobility are two basic
requirements for future access networks; in order to support new and real-time
broadband applications, DSL and cable modems are unable to withstand such
demand. They were designed on top of previous communication infrastructures
that were not optimized for data traffic. In cable modem systems, just a few RF
channels are dedicated for data, while most of the bandwidth is reserved for
servicing legacy analog video. DSL copper systems only allow limited data rates
at the required distances due to signal attenuation and crosstalk. It has become
necessary for a new data-centric solution, a technology that would be optimized
for Internet Protocol (IP) data. Emerging as the next-generation Ethernet passive
optical network is the 10G-EPON. The technical specification was standardized
by the IEEE 802.3av Task Force in September 2009. One of the major
requirements in designing the specification is to develop a platform of
co-existence with the current 1G-EPON network on the same optical system and
backward compatibility. This paper has described the service trends and
operator requirements that define the evolution of EPON and future trends. It
has proved that optical technologies are evolving continuously in the direction
of higher speeds, higher wavelength capability, and higher loss budgets. A
smart allocation and coexistence strategy of new and existing users is required,
with a logical combination of different types of users such as business and
residential subscribers. WDM-PONs implemented possibly by TDMA and
TDM techniques are unarguably the next stage in PONs evolution. With optical
amplification, they present higher bandwidth per ONU, maximum reach, and
splitting ratios, as compared to EPON and GPON architectures. They can
withstand various fiber topologies and gives additional functionality such as
protection. WDM-PONs if implemented, will give access to new broadband
structure and a broad scale residential applications.

2.2 REVISITING WIRELESS INTERNET CONNECTIVITY:
5G VS WI-FI 6

Edward J. Oughton et al. have proposed in this paper that, in recent years, significant


attention has been directed toward the fifth generation of wireless broadband
connectivity known as ‘5G’, currently being deployed by Mobile Network
Operators. Surprisingly, there has been considerably less attention paid to ‘Wi-
Fi 6’, the new IEEE 802.11ax standard in the family of Wireless Local Area
Network technologies with features targeting private, edge-networks. This paper
revisits the suitability of cellular and Wi-Fi in delivering high speed wireless
Internet connectivity. Both technologies aspire to deliver significantly enhanced
performance, enabling each to deliver much faster wireless broadband
connectivity, and provide further support for the Internet of Things and
Machine-to-Machine communications, positioning the two technologies as
technical substitutes in many usage scenarios. We conclude that both are likely
to play important roles in the future, and simultaneously serve as competitors
and complements. We anticipate that 5G will remain the preferred technology
for wide-area coverage, while Wi-Fi 6 will remain the preferred technology for
indoor use, thanks to its much lower deployment costs. However, the traditional
boundaries that differentiated earlier generations of cellular and Wi-Fi are
blurring. Proponents of one technology may argue for the benefits of their
chosen technology displacing the other, requesting regulatory policies that
would serve to tilt the marketplace in their favor. We believe such efforts need
to be resisted, and that both technologies have important roles to play in the
marketplace, based on the needs of heterogeneous use cases. Both technologies
should contribute to achieving the goal of providing affordable, reliable, and
ubiquitously available high-capacity wireless broadband connectivity. Almost in
synchrony we are seeing the roll-out of the next generation of wireless
technologies for both cellular and Wi-Fi connectivity. While there has been

9
much excitement around the world regarding the fifth generation of cellular
technology known as ‘5G’, there is comparable enthusiasm for the next version
of the Institute of Electrical and Electronics Engineers’ (IEEE) 802.11 Wireless
Local Area Network (WLAN) standard, ‘Wi-Fi 6’. Next-generation wireless
connectivity technologies are needed to further enable such connectivity. Herein we revisited
the debate associated with wireless Internet connectivity by providing a new
evaluation of the two main technologies involved in the provision of next
generation wireless broadband: 5G and Wi-Fi 6. Our analysis highlights how
the futures for 5G and Wi-Fi 6 needs to be understood within the larger context
of how earlier generations of cellular and Wi-Fi technologies have shaped the
evolution of wireless networking and what this may mean for the future. First,
in terms of general demand-side trends, data traffic is expected to continue to
grow significantly with an increasing proportion of devices utilizing wireless
connectivity as the first connection point. The COVID-19 pandemic of 2019–
2021 has highlighted the importance of enhanced digital connectivity to support
remote work, education, and social engagement during the global crisis. But
there may also be potentially new trends which could arise out of the shifting
work and social patterns produced by the pandemic. Such changes could have
repercussions for the spatial and temporal usage of wireless broadband
connectivity and the associated economics of each technology.

2.3 INTRUSION DETECTION SYSTEMS IN THE INTERNET


OF THINGS: A COMPREHENSIVE INVESTIGATION

Somayeh Hajiheidari et al. have proposed in this system that, recently, a new


dimension of intelligent objects has been provided by reducing the power
consumption of electrical appliances. Daily physical objects have been
upgraded by electronic devices over the Internet to create local intelligence and
make communication with cyberspace. Internet of things (IoT) as a new term in
this domain is used for realizing these intelligent objects. Since the objects in
the IoT are directly connected to the unsafe Internet, the resource constraint
devices are easily accessible by the attacker. Such public access to the Internet
causes things to become vulnerable to the intrusions. The purpose is to
categorize the attacks that do not explicitly damage the network, but by
infecting the internal nodes, they are ready to carry out the attacks on the
network, which are named internal attacks. Therefore, the significance of
Intrusion Detection Systems (IDSs) in the IoT is undeniable. However, despite
the importance of this topic, there is not any comprehensive and systematic
review about discussing and analyzing its significant mechanisms. Therefore, in
the current paper, a Systematic Literature Review (SLR) of the IDSs in the IoT
environment has been presented. Then detailed categorizations of the IDSs in
the IoT (anomaly-based, signature-based, specification-based, and hybrid),
(centralized, distributed, hybrid), (simulation, theoretical), (denial of service
attack, Sybil attack, replay attack, selective forwarding attack, wormhole attack,
black hole attack, sinkhole attack, jamming attack, false data attack) have also
been provided using common features. Then the advantages and disadvantages
of the selected mechanisms are discussed. Finally, the examination of the open
issues and directions for future trends are also provided. Connectivity of
physical things to the Internet makes it possible to control and manage them
from a distance. These devices sense and record client activities, forecast their
future actions and give him/her the useful services. It is anticipated that, in the
next decade, the Internet will be a seamless fabrication of common networks
and related objects. The IoT as a new term in data and information age was
originally introduced by the MIT Auto-ID Center in 1998. It represents a vision
where objects are exclusively identified and available over the Internet. Also,
the real world can be more available through personal computers and networked
devices over the IoT and Internet. US National Intelligence Council (NIC)
believes that IoT has a potential effect on US national power. So, they have
decided to put it on the list of six disruptive civil technologies. This study has

proposed a systematic review of IDSs in IoT environments. In a similar
way, we have reviewed numerous highly developed intrusion detection mechanisms in the
IoT, clarifying and discussing open issues via an in-depth analysis of over 40
main studies among the basic 324 papers. Based on the accessible literature, the
found papers are categorized into four main categories including anomaly-based
IDS, signature-based IDS, specification based IDS, hybrid IDS and also three
categories including centralized, distributed, and hybrid.

2.4 ENSEMBLE LEARNING FOR INTRUSION DETECTION


SYSTEMS: A SYSTEMATIC MAPPING STUDY AND CROSS-
BENCHMARK EVALUATION

Bayu Adhi Tama et al. have proposed in this system that intrusion detection systems
(IDSs) are intrinsically linked to a comprehensive solution of cyberattacks
prevention instruments. To achieve a higher detection rate, the ability to design
an improved detection framework is sought after, particularly when utilizing
ensemble learners. Designing an ensemble often lies in two main challenges
such as the choice of available base classifiers and combiner methods. This
paper performs an overview of how ensemble learners are exploited in IDSs by
means of systematic mapping study. We collected and analyzed 124 prominent
publications from the existing literature. The selected publications were then
mapped into several categories such as years of publications, publication
venues, datasets used, ensemble methods, and IDS techniques. Furthermore, this
study reports and analyzes an empirical investigation of a new classifier
ensemble approach, called stack of ensemble (SoE) for anomaly-based IDS. The
SoE is an ensemble classifier that adopts parallel architecture to combine three
individual ensemble learners such as random forest, gradient boosting machine,
and extreme gradient boosting machine in a homogeneous manner. The
performance significance among classification algorithms is statistically

examined in terms of their Matthews correlation coefficients, accuracies, false
positive rates, and area under ROC curve metrics. Our study fills the gap in
current literature concerning an up-to-date systematic mapping study, not to
mention an extensive empirical evaluation of the recent advances of ensemble
learning techniques applied to Istle ensemble of classifiers; which is hereafter
mentioned as an ensemble learner, has drawn a lot of interest in cybersecurity
research, and in an intrusion detection system (IDS) domain is no exception. An
IDS deals with the proactive and responsive detection of external aggressors
and anomalous operations of the server before they make such a massive
destruction. As of today, a variety number of cyberattacks has been in perilous
situations, placing some organization’s critical infrastructures into risk. A
successful attack may lead to difficult consequences such as but not limited to
financial loss, operational termination, and confidential information disclosure.
Moreover, the larger the organization’s network, the bigger the chance for
attackers to exploit. The complexity of the network may also give rise to
vulnerabilities and other specific threats. Therefore, security mitigation and
protection strategies should be considered mandatory. This study revealed that
there has been a great interest in applying random forest classifier for IDSs.
This is because the implementation of random forest is diverse and almost
effortless to apply. For instance, Caret, Boruta, VSURF, etc. are examples
of random forest implementations in R.

2.5 DEEP ABSTRACTION AND WEIGHTED FEATURE


SELECTION FOR WI-FI IMPERSONATION DETECTION

Muhamad Erza Aminanto et al. have proposed in this system that the recent advances
in mobile technologies have resulted in IoT-enabled devices becoming more
pervasive and integrated into our daily lives. The security challenges that need
to be overcome mainly stem from the open nature of a wireless medium such as
a Wi-Fi network. An impersonation attack is an attack in which an adversary is
disguised as a legitimate party in a system or communications protocol. The
connected devices are pervasive, generating high-dimensional data on a large
scale, which complicates simultaneous detections. Feature learning, however,
can circumvent the potential problems that could be caused by the large-volume
nature of network data. This study thus proposes a novel Deep-Feature
Extraction and Selection (D-FES), which combines stacked feature extraction
and weighted feature selection. The stacked autoencoding is capable of
providing representations that are more meaningful by reconstructing the
relevant information from its raw inputs. We then combine this with modified
weighted feature selection inspired by an existing shallow-structured machine
learner. We finally demonstrate the ability of the condensed set of features to
reduce the bias of a machine learner model as well as the computational
complexity. Our experimental results on a well-referenced Wi-Fi network
benchmark dataset, namely, the Aegean Wi-Fi Intrusion Dataset (AWID), prove
the usefulness and the utility of the proposed D-FES by achieving a detection
accuracy of 99.918% and a false alarm rate of 0.012%, the most accurate
detection of impersonation attacks reported in the literature. The rapid
growth of the Internet has led to a significant increase in wireless network
traffic in recent years. According to a worldwide telecommunication
consortium, proliferation of 5G and Wi-Fi networks is expected to occur in the
coming decades; by 2020, wireless network traffic was anticipated to account
for two thirds of total Internet traffic, with 66% of IP traffic expected to be
generated by Wi-Fi and cellular devices alone. Although wireless networks such
as IEEE 802.11 have been widely deployed to provide users with mobility and
flexibility in the form of high-speed local area connectivity, issues such as
privacy and security have been raised. The rapid spread of Internet of Things
(IoT)-enabled devices has left wireless networks exposed to both passive and
active attacks, the number of which has grown dramatically. Examples of these
attacks are impersonation, flooding, and injection attacks. In this study, we

presented a novel method, D-FES, which combines stacked feature extraction
and weighted feature selection techniques in order to detect impersonation
attacks in Wi-Fi networks. SAE is implemented to achieve high-level
abstraction of complex and large amounts of Wi-Fi network data. The model-
free properties in SAE and its learnability on complex and large-scale data take
into account the open nature of Wi-Fi networks, where an adversary can easily
inject false data or modify data forwarded in the network.

CHAPTER 3

SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

With an increase in the number and types of network attacks, traditional
firewalls and data encryption methods can no longer meet the needs of current
network security. As a result, intrusion detection systems have been proposed to
deal with network threats. Current mainstream intrusion detection algorithms
are aided by machine learning but suffer from low detection rates and the need
for extensive feature engineering. To address the issue of low detection
accuracy, this paper proposes a model for traffic anomaly detection named the
deep learning model for network intrusion detection (DLNID), which combines an
attention mechanism with a bidirectional long short-term memory (Bi-LSTM)
network: it first extracts sequence features of data traffic through a
convolutional neural network (CNN), then reassigns the weights of each channel
through the attention mechanism, and finally uses Bi-LSTM to learn the sequence
features. Public intrusion detection datasets generally suffer from serious
data imbalance. To address this, the paper employs adaptive synthetic sampling
(ADASYN) to expand the minority class samples, eventually forming a relatively
balanced dataset, and uses a modified stacked autoencoder for data
dimensionality reduction with the objective of enhancing information fusion.

3.1.1 DRAWBACKS

• DLNID models are computationally expensive to train and deploy, because
they require large amounts of data and powerful hardware to train effectively.

• They require large amounts of labeled data to train effectively, and this
data can be difficult and expensive to collect.

• DLNID models are also sensitive to the quality of the training data: if the
training data is biased or contains noise, the model will not learn to detect
intrusions accurately.

• These are black-box models, meaning it is difficult to understand how they
make predictions. This can make it difficult to debug DLNID models and to
identify false positives and false negatives.

3.2 PROPOSED SYSTEM

The proposed system integrates advanced techniques for intrusion detection in
the dynamic cybersecurity landscape. Combining a Probability Model for
baseline behavior analysis, a Link-Anomaly Score computation for identifying
suspicious network connections, Change Point Analysis and Dynamic Time
Warping for detecting shifts in statistical properties and temporal patterns, and
the Adaptive Decision Tree-Support Vector Machine (ADT-SVM) algorithm
for accurate classification, the system offers a comprehensive approach to
identifying potential security threats. By leveraging these modules, the proposed
system aims to enhance the adaptability and effectiveness of intrusion detection,
providing a robust defense mechanism against evolving cyber threats. The
ADT-SVM algorithm, with its ability to learn and categorize diverse data
attributes, plays a central role in the proposed system, contributing to a more
resilient and responsive cybersecurity framework, and the KDD dataset is used
as a benchmark to validate the system's performance.

3.2.1 ADVANTAGES

• ADT-SVM's adaptability enhances real-time response to evolving cyber
threats.
• Integration of temporal and thermal correlations improves anomaly
detection accuracy.
• Machine learning-based IDS increases efficiency by automating intrusion
detection processes.
• Utilizing the KDD dataset as a benchmark provides standardized
performance evaluation.
• A data-driven approach improves the system's ability to categorize and
respond to diverse intrusion scenarios.

3.3 FEASIBILITY STUDY

The preliminary investigation examines project feasibility, the likelihood that
the system will be useful to the organization. The main objective of the
feasibility study is to test the technical, operational and economic feasibility
of adding new modules and debugging the old running system. Any system is
feasible given unlimited resources and infinite time. The following aspects are
considered in the feasibility study portion of the preliminary investigation:

• Technical Feasibility
• Operational Feasibility
• Economic Feasibility

3.3.1 TECHNICAL FEASIBILITY


The technical issue usually raised during the feasibility stage of the
investigation includes the following:

• Does the necessary technology exist to do what is suggested?

• Does the proposed equipment have the technical capacity to hold the data
required to use the new system?
• Will the proposed system provide adequate response to inquiries, regardless
of the number or location of users?
• Can the system be upgraded if developed?
• Are there technical guarantees of accuracy, reliability, ease of access and
data security?
Earlier no system existed to cater to the needs of ‘Secure Infrastructure
Implementation System’. The current system developed is technically feasible.
It is a web-based user interface for audit workflow on a DB2 database, and thus
provides easy access to the users. The database's purpose is to create,
establish and maintain a workflow among various entities in order to facilitate
all concerned users in their various capacities or roles. Permission to the users
would be granted based on the roles specified.

Therefore, it provides the technical guarantee of accuracy, reliability and


security. The software and hardware requirements for the development of this
project are few, and are already available in-house at NIC or as free
open-source software. The work for the project is done with the current equipment and
existing software technology. Necessary bandwidth exists for providing a fast
feedback to the users irrespective of the number of users using the system.

3.3.2 OPERATIONAL FEASIBILITY


Proposed projects are beneficial only if they can be turned into information
systems that meet the organization's operating requirements. Operational
feasibility aspects of the project are to be taken as an important part
of the project implementation. Some of the important issues raised to test the
operational feasibility of a project include the following:

• Is there sufficient support for the management from the users?

• Will the system be used and work properly if it is being developed and
implemented?
• Will there be any resistance from the user that will undermine the possible
application benefits?
This system is targeted to be in accordance with the above-mentioned
issues. Beforehand, the management issues and user requirements have been
taken into consideration. So there is no question of resistance from the users that
can undermine the possible application benefits.

The well-planned design would ensure the optimal utilization of the computer
resources and would help in the improvement of performance status.

3.3.3 ECONOMIC FEASIBILITY


A system that can be developed technically, and that will be used if
installed, must still be a good investment for the organization. In economic
feasibility, the development cost of creating the system is evaluated against the
ultimate benefit derived from the new system. Financial benefits must equal or
exceed the costs.

The system is economically feasible. It does not require any additional
hardware or software. Since the interface for this system is developed using the
existing resources and technologies available at NIC, there is only nominal
expenditure, and economic feasibility is certain.

CHAPTER 4

SYSTEM SPECIFICATION

4.1 HARDWARE REQUIREMENTS

CPU type : Intel core i3 processor

Clock speed : 3.0 GHz

RAM size : 8 GB

Hard disk capacity : 40 GB

Keyboard type : Internet Keyboard

CD-drive type : 52x max

4.2 SOFTWARE REQUIREMENTS

Operating System : Windows 10

Front End : JAVA

CHAPTER 5

SOFTWARE DESCRIPTION

5.1 FRONT END

JAVA

The software requirement specification is created at the end of the analysis task.
The function and performance allocated to software as part of system
engineering are developed by establishing a complete information report as
functional representation, a representation of system behavior, an indication of
performance requirements and design constraints, appropriate validation
criteria.

FEATURES OF JAVA

Java platform has two components:

1. The Java Virtual Machine (Java VM)


2. The Java Application Programming Interface (Java API)
The Java API is a large collection of ready-made software components that
provide many useful capabilities, such as graphical user interface (GUI)
widgets. The Java API is grouped into libraries (packages) of related
components.

The following figure depicts a Java program, such as an application or applet,
that is running on the Java platform. As the figure shows, the Java API and
Virtual Machine insulate the Java program from hardware dependencies.

As a platform-independent environment, Java can be a bit slower than
native code. However, smart compilers, well-tuned interpreters, and just-in-time
byte code compilers can bring Java's performance close to that of native code
without threatening portability.

SOCKET OVERVIEW:

A network socket is a lot like an electrical socket. Various plugs around


the network have a standard way of delivering their payload. Anything that
understands the standard protocol can “plug in” to the socket and communicate.

Internet protocol (IP) is a low-level routing protocol that breaks data into
small packets and sends them to an address across a network, which does not
guarantee to deliver said packets to the destination.

Transmission Control Protocol (TCP) is a higher-level protocol that manages to


reliably transmit data. A third protocol, User Datagram Protocol (UDP), sits
next to TCP and can be used directly to support fast, connectionless, unreliable
transport of packets.

CLIENT/SERVER:

A server is anything that has some resource that can be shared.


There are compute servers, which provide computing power; print servers,
which manage a collection of printers; disk servers, which provide networked

disk space; and web servers, which store web pages. A client is simply any
other entity that wants to gain access to a particular server.

A server process is said to “listen” to a port until a client connects


to it. A server is allowed to accept multiple clients connected to the same port
number, although each session is unique. To manage multiple client
connections, a server process must be multithreaded or have some other means
of multiplexing the simultaneous I/O.
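A minimal sketch of such a multithreaded server is shown below; the class name, port handling, and line-echo protocol are illustrative assumptions, not part of the project's code:

```java
import java.io.*;
import java.net.*;

class EchoServer {
    private final ServerSocket server;

    EchoServer(int port) throws IOException {
        server = new ServerSocket(port); // port 0 asks the OS for a free port
    }

    int getPort() {
        return server.getLocalPort();
    }

    /** Accept clients forever, dedicating one thread to each session. */
    void serve() {
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket client = server.accept();          // blocks until a client connects
                    new Thread(() -> handle(client)).start(); // one thread per client session
                }
            } catch (IOException e) {
                // socket closed: stop accepting
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    /** Echo every line the client sends straight back to it. */
    private static void handle(Socket client) {
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException ignored) {
        }
    }
}
```

Each accepted connection gets its own thread, so several clients can hold unique sessions on the same listening port at once, as described above.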

RESERVED SOCKETS:

Once connected, a higher-level protocol ensues, which is


dependent on which port the user is using. TCP/IP reserves the lower 1,024
ports for specific protocols. Port number 21 is for FTP, 23 is for Telnet, 25 is
for e-mail, 79 is for finger, 80 is for HTTP, 119 is for Netnews - and the list
goes on. It is up to each protocol to determine how a client should interact
with the port.

JAVA AND THE NET:

Java supports TCP/IP by extending the already established stream I/O
interface. Java supports both the TCP and UDP protocol families. TCP is used
for reliable stream-based I/O across the network. UDP supports a simpler,
hence faster, point-to-point datagram-oriented model.

INETADDRESS:

The InetAddress class is used to encapsulate both the numerical IP
address and the domain name for that address. The user interacts with this class
by using the name of an IP host, which is more convenient and understandable
than its IP address. The InetAddress class hides the number inside. As of Java 2,
version 1.4, InetAddress can handle both IPv4 and IPv6 addresses.

FACTORY METHODS:

The InetAddress class has no visible constructors. To create an
InetAddress object, the user uses one of the available factory methods. Factory
methods are merely a convention whereby static methods in a class return an
instance of that class. This is done in lieu of overloading a constructor with
various parameter lists, when having unique method names makes the results
much clearer. Three commonly used InetAddress factory methods are:

1. static InetAddress getLocalHost( ) throws UnknownHostException

2. static InetAddress getByName(String hostName) throws UnknownHostException

3. static InetAddress[ ] getAllByName(String hostName) throws UnknownHostException

INSTANCE METHODS:

The InetAddress class also has several other methods, which can be used
on the objects returned by the methods just discussed. Here are some of the
most commonly used.

1. boolean equals(Object other) - Returns true if this object has the same
Internet address as other.

2. byte[ ] getAddress( ) - Returns a byte array that represents the
object's Internet address in network byte order.

3. String getHostAddress( ) - Returns a string that represents the host
address associated with the InetAddress object.

4. String getHostName( ) - Returns a string that represents the host name
associated with the InetAddress object.

5. boolean isMulticastAddress( ) - Returns true if this Internet address is a
multicast address; otherwise, it returns false.

6. String toString( ) - Returns a string that lists the host name and the IP
address for convenience.
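A short sketch exercising the factory and instance methods above; the loopback address is used so the example does not depend on a name server:

```java
import java.net.InetAddress;

class InetAddressDemo {
    public static void main(String[] args) throws Exception {
        // Factory method: InetAddress has no visible constructor.
        InetAddress addr = InetAddress.getByName("127.0.0.1");
        System.out.println(addr.getHostAddress());     // the numeric address as a string
        System.out.println(addr.isMulticastAddress()); // loopback is not a multicast address
    }
}
```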

TCP/IP CLIENT SOCKETS:

TCP/IP sockets are used to implement reliable, bidirectional, persistent,


point-to-point and stream-based connections between hosts on the Internet. A
socket can be used to connect Java’s I/O system to other programs that may
reside either on the local machine or on any other machine on the Internet.

There are two kinds of TCP sockets in Java. One is for servers, and the
other is for clients. The ServerSocket class is designed to be a "listener," which
waits for clients to connect before doing anything. The Socket class is designed
to connect to server sockets and initiate protocol exchanges.

The creation of a Socket object implicitly establishes a connection


between the client and server. There are no methods or constructors that
explicitly expose the details of establishing that connection. Here are two
constructors used to create client sockets

Socket(String hostName, int port) - Creates a socket connecting the local
host to the named host and port; can throw an UnknownHostException or
an IOException.

Socket(InetAddress ipAddress, int port) - Creates a socket using a
preexisting InetAddress object and a port; can throw an IOException.

A socket can be examined at any time for the address and port
information associated with it, by use of the following methods:

➢ InetAddress getInetAddress( ) - Returns the InetAddress associated
with the Socket object.
➢ int getPort( ) - Returns the remote port to which this Socket object is
connected.
➢ int getLocalPort( ) - Returns the local port to which this Socket object
is bound.
Once the Socket object has been created, it can also be examined to gain
access to the input and output streams associated with it. Each of these methods
can throw an IOException if the socket has been invalidated by a loss of
connection on the Net.

InputStream getInputStream( ) - Returns the InputStream associated
with the invoking socket.

OutputStream getOutputStream( ) - Returns the OutputStream
associated with the invoking socket.
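The sketch below ties these pieces together in a small client helper; the one-line request/reply protocol is an assumption for illustration only:

```java
import java.io.*;
import java.net.*;

class LineClient {
    /** Connect to host:port, send one request line, and return the single reply line. */
    static String request(String host, int port, String msg) throws IOException {
        try (Socket socket = new Socket(host, port); // connection established implicitly
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(socket.getInputStream()))) {
            // The socket can report its endpoint details at any time.
            System.out.println("connected to " + socket.getInetAddress()
                    + ":" + socket.getPort());
            out.println(msg);
            return in.readLine();
        }
    }
}
```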

TCP/IP SERVER SOCKETS:

Java has a different socket class that must be used for creating server
applications. The ServerSocket class is used to create servers that listen for
either local or remote client programs to connect to them on published ports.
ServerSockets are quite different from normal Sockets.

When the user creates a ServerSocket, it will register itself with the system
as having an interest in client connections.

➢ ServerSocket(int port) - Creates a server socket on the specified port with a
queue length of 50.
➢ ServerSocket(int port, int maxQueue) - Creates a server socket on the
specified port with a maximum queue length of maxQueue.
➢ ServerSocket(int port, int maxQueue, InetAddress localAddress) - Creates
a server socket on the specified port with a maximum queue length of
maxQueue. On a multihomed host, localAddress specifies the IP address
to which this socket binds.
➢ ServerSocket has a method called accept( ), which is a blocking call that
will wait for a client to initiate communications and then return a
normal Socket that is then used for communication with the client.
URL:

The Web is a loose collection of higher-level protocols and file formats,
all unified in a web browser. One of the most important aspects of the Web is
that Tim Berners-Lee devised a scalable way to locate all of the resources of the
Net. The Uniform Resource Locator (URL) is used to name anything and
everything reliably.

The URL provides a reasonably intelligible form to uniquely identify or
address information on the Internet. URLs are ubiquitous; every browser uses
them to identify information on the Web.
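Java's java.net.URL class decomposes such a locator into its parts, as the small sketch below shows:

```java
import java.net.URL;

class UrlParts {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com:8080/docs/index.html");
        System.out.println(url.getProtocol()); // "http"
        System.out.println(url.getHost());     // "example.com"
        System.out.println(url.getPort());     // 8080
        System.out.println(url.getFile());     // "/docs/index.html"
    }
}
```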

CHAPTER 6

PROJECT DESCRIPTION

6.1 PROBLEM DEFINITION

Traditional network security measures such as firewalls and data encryption are
no longer sufficient to protect networks from the increasing number and types
of cyber-attacks. Intrusion detection systems (IDSs) have been proposed to
address this challenge, but they typically suffer from low detection rates and the
need for extensive feature engineering. Deep learning models have the potential
to overcome these challenges and provide more effective intrusion detection.
Deep learning models can learn complex patterns in network traffic data and
detect new and emerging threats without the need for extensive feature
engineering. However, deep learning models also have several drawbacks,
including high computational cost, data requirements, lack of interpretability,
and vulnerability to adversarial attacks.

6.2 MODULE DESCRIPTION

6.2.1 PROBABILITY MODEL


This module involves the development and application of a probability model
for analyzing network data. The probability model assesses the likelihood
of certain events or patterns within the data, providing a foundational
understanding of the baseline behavior. By establishing a probability
distribution, anomalies can be identified as deviations from expected patterns,
enabling the system to flag potentially malicious activities.
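As a concrete illustration, the sketch below scores one numeric traffic feature against a Gaussian baseline and flags values whose likelihood falls below a threshold. The Gaussian form, the parameters, and the threshold are all assumptions for illustration; the report does not fix a particular distribution:

```java
class BaselineModel {
    /** Gaussian density of a single baseline feature with the given mean and std. */
    static double likelihood(double x, double mean, double std) {
        double z = (x - mean) / std;
        return Math.exp(-0.5 * z * z) / (std * Math.sqrt(2.0 * Math.PI));
    }

    /** Flag an observation whose likelihood under the baseline is below the threshold. */
    static boolean isAnomalous(double x, double mean, double std, double threshold) {
        return likelihood(x, mean, std) < threshold;
    }
}
```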

6.2.2 COMPUTING THE LINK-ANOMALY SCORE


In this module, the system calculates link-anomaly scores to quantify the
abnormality of network links or connections. The computation involves
analyzing various attributes associated with network links, such as traffic
patterns, communication frequencies, or data transfer volumes. A higher link-
anomaly score may indicate suspicious or anomalous behavior, directing the
attention of the intrusion detection system to potential security threats within the
network.
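A common way to aggregate such attributes into a single link-anomaly score is to sum their negative log-probabilities, so that rare attribute values drive the score up. The sketch below follows that convention as an assumption for illustration; it is not the report's exact formula:

```java
class LinkAnomaly {
    /**
     * Sum of negative log-probabilities of a link's attribute values
     * (e.g. traffic pattern, communication frequency, transfer volume);
     * a higher score means the connection looks less like the baseline.
     */
    static double score(double[] attributeProbs) {
        double s = 0.0;
        for (double p : attributeProbs) {
            s += -Math.log(Math.max(p, 1e-12)); // clamp to avoid log(0)
        }
        return s;
    }
}
```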

6.2.3 CHANGE POINT ANALYSIS AND DTW


This module focuses on change point analysis and Dynamic Time Warping
(DTW) techniques. Change point analysis aims to identify shifts or deviations in
the statistical properties of the data, signaling potential security incidents. DTW,
on the other hand, measures the similarity between sequences over
time, aiding in the detection of variations in temporal patterns. Integrating these
methods enhances the system's ability to adapt to evolving cyber threats and
identify deviations from normal behavior.
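The temporal-similarity measure the module relies on can be computed with the classic dynamic-programming recurrence for DTW, sketched below (absolute difference is assumed as the local cost):

```java
class Dtw {
    /** Classic O(n*m) dynamic-time-warping distance between two sequences. */
    static double distance(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] d = new double[n + 1][m + 1];
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= m; j++)
                d[i][j] = Double.POSITIVE_INFINITY;
        d[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = Math.abs(a[i - 1] - b[j - 1]); // local cost of aligning a[i-1] with b[j-1]
                d[i][j] = cost + Math.min(d[i - 1][j - 1],
                        Math.min(d[i - 1][j], d[i][j - 1]));  // best of match, insertion, deletion
            }
        }
        return d[n][m];
    }
}
```

Because DTW allows one sample to align with several samples of the other sequence, a stretched copy of a sequence still has distance zero, which is what makes the measure useful for comparing temporal patterns of different speeds.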

6.2.4 ADT-SVM DETECTION METHOD


The ADT-SVM Detection Method module implements the Adaptive Decision
Tree-Support Vector Machine (ADT-SVM) algorithm for intrusion detection.
This algorithm combines the adaptability of decision trees with the
classification power of support vector machines. The ADT-SVM model is
trained on labeled data, learning to distinguish between normal and anomalous
network behavior. Once trained, it is employed to categorize incoming data
attributes into predefined classes, such as Basic, Content, Traffic, and Host,
facilitating the identification of potential security threats within the network.
The module likely involves fine-tuning and optimizing the ADT-SVM
parameters for optimal detection performance.
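Conceptually, the cascade can be pictured as below; `treeRule`, `treeConfident`, and `svmRule` are hypothetical stand-ins for the trained models, used only to show the control flow, not the project's actual implementation:

```java
import java.util.function.Predicate;

class CascadeSketch {
    /**
     * Two-stage cascade: the fast decision-tree stage answers the records it is
     * confident about, and only the remaining hard cases reach the SVM stage.
     */
    static boolean isIntrusion(double[] record,
                               Predicate<double[]> treeRule,
                               Predicate<double[]> treeConfident,
                               Predicate<double[]> svmRule) {
        if (treeConfident.test(record)) {
            return treeRule.test(record); // stage 1: decision-tree verdict
        }
        return svmRule.test(record);      // stage 2: SVM handles uncertain cases
    }
}
```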

6.3 SYSTEM FLOW DIAGRAM

Loading Dataset → Preprocessing → Feature Selection →
Computing Anomaly Score Based on Selected Features →
Detecting Threats Using ADT-SVM Method → Result

6.4 INPUT DESIGN

In the context of the research project focused on cybersecurity and intrusion


detection using the ADT-SVM algorithm, the input design involves carefully
structuring the data fed into the system. The design encompasses the selection
and preprocessing of relevant network datasets, including the KDD dataset as a
benchmark. Emphasis is placed on capturing temporal and thermal correlations
within the data, ensuring that the input reflects the dynamic nature of cyber
threats. Data attributes are categorized into four classes: Basic, Content, Traffic,
and Host, creating a comprehensive representation of network behavior. The
success of the input design is critical for the effective functioning of the ADT-
SVM algorithm, enabling the Intrusion Detection System (IDS) to learn and
adapt to diverse intrusion scenarios while providing a foundation for robust
performance evaluation using metrics such as Detection Rate and False Alarm
Rate.

6.5 OUTPUT DESIGN

The output design in the context of the cybersecurity and intrusion detection
project utilizing the ADT-SVM algorithm involves the systematic presentation
and interpretation of results generated by the Intrusion Detection System (IDS).
The output includes categorizations of incoming data into four classes: Basic,
Content, Traffic, and Host, allowing for a granular understanding of network
behavior. Detection and classification outcomes are presented through
visualizations or reports that highlight instances of identified intrusions and
false alarms.

CHAPTER 7

SYSTEM TESTING AND IMPLEMENTATION

7.1 SYSTEM TESTING

System testing in the context of the cybersecurity project employing the ADT-
SVM algorithm involves a comprehensive evaluation of the entire Intrusion
Detection System (IDS). This phase verifies the functionality, performance, and
reliability of the IDS by subjecting it to various test cases and scenarios. The
testing process includes assessing the adaptability of the ADT-SVM algorithm
to dynamic cyber threats and ensuring its ability to categorize data attributes
into the designated classes: Basic, Content, Traffic, and Host. The KDD dataset
is utilized to simulate real-world conditions and benchmark the system's
performance. System testing also incorporates the evaluation of key metrics
such as Detection Rate (DR) and False Alarm Rate (FAR) to gauge the accuracy
and efficiency of the IDS in identifying intrusions while minimizing false
positives.
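Both metrics reduce to simple ratios over confusion-matrix counts, as the sketch below shows (TP = attacks flagged, FN = attacks missed, FP = normal traffic flagged, TN = normal traffic passed):

```java
class IdsMetrics {
    /** Detection Rate (DR) = TP / (TP + FN): the fraction of real attacks caught. */
    static double detectionRate(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    /** False Alarm Rate (FAR) = FP / (FP + TN): normal traffic wrongly flagged. */
    static double falseAlarmRate(int fp, int tn) {
        return (double) fp / (fp + tn);
    }
}
```

For example, a run that catches 90 of 100 attacks while misflagging 5 of 100 normal connections has DR = 0.9 and FAR = 0.05.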

7.2 SYSTEM IMPLEMENTATION

System implementation in the cybersecurity project utilizing the ADT-SVM


algorithm involves the deployment and execution of the developed Intrusion
Detection System (IDS) in a real-world or simulated environment. This phase
encompasses translating the research findings and algorithmic models into a
functional system capable of actively monitoring and analyzing network data.
The ADT-SVM algorithm is integrated into the IDS architecture to classify
incoming data attributes into predefined classes such as Basic, Content, Traffic,
and Host. The implementation process also includes the utilization of the KDD
dataset as a benchmark to validate the system's performance.

CHAPTER 8

SYSTEM MAINTENANCE

The objective of this maintenance work is to make sure that the system keeps
working at all times without any bugs. Provision must be made for environmental
changes which may affect the computer or software system. This is called the
maintenance of the system. Nowadays there is rapid change in the software
world, and the system should be capable of adapting to these changes. In this
project, processes can be added without affecting other parts of the system.
Maintenance plays a vital role. The system is liable to accept any
modification after its implementation. This system has been designed to
accommodate all new changes, and doing so will not affect the system's
performance or its accuracy.

Maintenance is necessary to eliminate errors in the system during its working


life and to tune the system to any variations in its working environment. It has
been seen that there are always some errors found in the system that must be
noted and corrected. It also means the review of the system from time to time.

The review of the system is done for:

o Knowing the full capabilities of the system.

o Knowing the required changes or the additional requirements.

o Studying the performance.

TYPES OF MAINTENANCE:

o Corrective maintenance

o Adaptive maintenance

o Perfective maintenance

o Preventive maintenance

8.1 CORRECTIVE MAINTENANCE

Corrective maintenance covers changes made to a system to repair
flaws in its design, coding, or implementation; the design of the software
may be changed. Corrective maintenance is applied to correct the errors
that occur during operation. For example, if the user enters an invalid
file type while submitting information in a particular field, corrective
maintenance displays an error message to the user so that the error can
be rectified.

Maintenance is a major income source. Nevertheless, even
today many organizations assign maintenance to unsupervised
beginners and less competent programmers.

The user's problems are often caused by the individuals who
developed the product, not the maintainer, and the code itself may be
badly written. Maintenance is despised by many software developers,
yet unless good maintenance service is provided, the client will
take future development business elsewhere. Maintenance is the most
important phase of software production, and also the most difficult and
most thankless.

8.2 ADAPTIVE MAINTENANCE:

Adaptive maintenance means changes made to a system to evolve its
functionality in response to changing business needs or technologies. If there
is any modification in the modules, the software will adopt those modifications.
If the user changes the server, the project will adapt to those changes, and the
modified server performs the same work as the existing one.

8.3 PERFECTIVE MAINTENANCE:

Perfective maintenance means changes made to a system to add new features or
improve performance. Perfective maintenance is done to take measures to
maintain special features. It means enhancing the performance or modifying the
programs to respond to the users' needs or changing needs. This proposed
system can be extended with additional functionality easily; if the user wants to
improve the performance further, this software can be easily upgraded.

8.4 PREVENTIVE MAINTENANCE:

Preventive maintenance involves changes made to a system to reduce the
chances of future system failure. The possible errors that might occur are
forecast and prevented with suitable preventive measures. If the user wants to
improve the performance of any process, new features can be added to the
system for this project.

CHAPTER 9

CONCLUSION

In conclusion, the presented cybersecurity framework, incorporating modules


such as the Probability Model, Link-Anomaly Score computation, Change Point
Analysis with Dynamic Time Warping, and the Adaptive Decision Tree-
Support Vector Machine (ADT-SVM) algorithm, constitutes a comprehensive
and adaptive intrusion detection system. By addressing the dynamic challenges
in the cyber threat landscape, this system leverages probabilistic analysis,
anomaly scoring, and machine learning to effectively identify potential security
threats. The integration of advanced techniques and the utilization of the ADT-
SVM algorithm contribute to the system's ability to adapt and learn from
evolving cyber threats. The proposed framework not only offers a multi-faceted
approach to intrusion detection but also emphasizes the importance of continual
adaptation in the face of emerging cybersecurity challenges.

FUTURE WORK

Future work in this domain could focus on refining and extending the proposed
cybersecurity framework to address emerging challenges. Further exploration of
advanced machine learning models, beyond ADT-SVM, could enhance the
system's detection capabilities. Investigating the integration of threat
intelligence feeds and real-time network monitoring technologies could
contribute to a more proactive defense mechanism. Additionally, incorporating
mechanisms for self-learning and adaptation to new attack vectors would be
crucial for staying ahead of evolving threats.

CHAPTER 10

APPENDICES

10.1 SOURCE CODE

DecisionTree.java

package adt;

import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Random;

import libsvm.svm;
import libsvm.svm_model;
import weka.classifiers.Evaluation;
import weka.core.Instances;
import weka.core.converters.CSVLoader;
import weka.classifiers.trees.J48;

/**
 * @author admin
 */
public class DecisionTree {

    Details dt = new Details();
    ArrayList<Integer> newCls = new ArrayList<Integer>();

    public void construct() {
        try {
            // Load the training split; the last attribute is the class label.
            CSVLoader csv1 = new CSVLoader();
            csv1.setSource(new File("train1.csv"));
            Instances trdata = csv1.getDataSet();
            trdata.setClassIndex(trdata.numAttributes() - 1);

            // Build the J48 decision tree on the training data.
            J48 nb = new J48();
            nb.buildClassifier(trdata);

            // Load the test split.
            CSVLoader csv2 = new CSVLoader();
            csv2.setSource(new File("test1.csv"));
            Instances tedata = csv2.getDataSet();
            tedata.setClassIndex(tedata.numAttributes() - 1);

            Evaluation eval = new Evaluation(trdata);
            eval.evaluateModel(nb, trdata);

            // Record the tree's predicted class for every test instance.
            for (int i = 0; i < tedata.numInstances(); i++) {
                int ind = (int) nb.classifyInstance(tedata.instance(i));
                newCls.add(ind);
            }

            ocSVM();
            System.out.println(eval.toClassDetailsString());
            ADTocSVM();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

40
    /** Plain one-class SVM stage: trains on train2.csv and predicts test2.csv. */
    public void ocSVM() {
        try {
            SVMData svm1 = new SVMData();
            svm1.readTrData("train2.csv");
            svm1.convertTrData("train1.txt");

            SVMTrain svmtr = new SVMTrain();
            svmtr.run();

            readData("test2.csv"); // writes test1.txt in the SVM input layout

            int predict_probability = 0;
            SVMPredict sm = new SVMPredict();
            BufferedReader input = new BufferedReader(new FileReader("test1.txt"));
            DataOutputStream output = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream("Res1.txt")));
            svm_model model = svm.svm_load_model("train1.model");

            if (predict_probability == 1) {
                if (svm.svm_check_probability_model(model) == 0) {
                    System.err.print("Model does not support probability estimates\n");
                    System.exit(1);
                }
            } else {
                if (svm.svm_check_probability_model(model) != 0) {
                    System.out.print("Model supports probability estimates, but disabled in prediction.\n");
                }
            }

            String res = sm.predict(input, output, model, predict_probability);
            input.close();
            output.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    /** ADT-SVM stage: the decision tree's predicted labels replace the
     *  ground-truth labels before the SVM prediction step. */
    public void ADTocSVM() {
        try {
            SVMData svm1 = new SVMData();
            svm1.readTrData("train2.csv");
            svm1.convertTrData("train1.txt");

            SVMTrain svmtr = new SVMTrain();
            svmtr.run();

            convertData2("test2.csv"); // writes newtest1.txt with DT-predicted labels

            int predict_probability = 0;
            SVMPredict sm = new SVMPredict();
            BufferedReader input = new BufferedReader(new FileReader("newtest1.txt"));
            DataOutputStream output = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream("Res1.txt")));
            svm_model model = svm.svm_load_model("train1.model");

            if (predict_probability == 1) {
                if (svm.svm_check_probability_model(model) == 0) {
                    System.err.print("Model does not support probability estimates\n");
                    System.exit(1);
                }
            } else {
                if (svm.svm_check_probability_model(model) != 0) {
                    System.out.print("Model supports probability estimates, but disabled in prediction.\n");
                }
            }

            String res = sm.predict(input, output, model, predict_probability);
            input.close();
            output.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    /** Converts a CSV test file into the tab-separated numeric layout (test1.txt). */
    public void readData(String pp) {
        try {
            ArrayList<String> cls = new ArrayList<>();

            File fe = new File(pp);
            FileInputStream fis = new FileInputStream(fe);
            byte data[] = new byte[fis.available()];
            fis.read(data);
            fis.close();

            String sg1[] = new String(data).split("\n");
            String col[] = sg1[0].split(",");   // first row: column names
            String colty[] = sg1[1].split(","); // second row: column types ("dis" = discrete)

            String colName[] = new String[col.length];
            String colType[] = new String[col.length];
            for (int i = 0; i < col.length; i++) {
                colName[i] = col[i];
                colType[i] = colty[i];
            }

            String dSet[][] = new String[sg1.length - 2][col.length];
            int nData[][] = new int[sg1.length - 2][col.length];

            // Load the data rows and collect the distinct class labels
            for (int i = 2; i < sg1.length; i++) {
                String sg2[] = sg1[i].split(",");
                for (int j = 0; j < sg2.length; j++) {
                    dSet[i - 2][j] = sg2[j];
                }
                String c1 = sg2[sg2.length - 1].trim();
                if (!cls.contains(c1)) {
                    cls.add(c1);
                }
            }

            System.out.println("cls " + cls);
            System.out.println("dset = " + dSet.length + " : " + dSet[0].length);

            // Encode each column: discrete values map to their index, numeric values are rounded
            for (int i = 0; i < colType.length; i++) {
                if (colType[i].trim().equals("dis")) {
                    ArrayList<String> at = new ArrayList<>();
                    for (int j = 0; j < dSet.length; j++) {
                        String g1 = dSet[j][i].trim();
                        if (!at.contains(g1)) {
                            at.add(g1);
                        }
                    }
                    for (int j = 0; j < dSet.length; j++) {
                        nData[j][i] = at.indexOf(dSet[j][i].trim());
                    }
                } else {
                    for (int j = 0; j < dSet.length; j++) {
                        nData[j][i] = (int) Math.round(Double.parseDouble(dSet[j][i]));
                    }
                }
            }

            // Write each row as: label <tab> feature1 <tab> feature2 ...
            StringBuilder txt1 = new StringBuilder();
            for (int i = 0; i < nData.length; i++) {
                String g1 = String.valueOf(nData[i][nData[0].length - 1]);
                for (int j = 0; j < nData[0].length - 1; j++) {
                    g1 = g1 + "\t" + nData[i][j];
                }
                txt1.append(g1.trim()).append("\n");
            }
            System.out.println(txt1);

            FileOutputStream fos = new FileOutputStream(new File("test1.txt"));
            fos.write(txt1.toString().getBytes());
            fos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /** Same conversion as readData, but each row is labelled with the decision
     *  tree's prediction (from newCls) and written to newtest1.txt. */
    public void convertData2(String pp) {
        try {
            ArrayList<String> cls = new ArrayList<>();

            File fe = new File(pp);
            FileInputStream fis = new FileInputStream(fe);
            byte data[] = new byte[fis.available()];
            fis.read(data);
            fis.close();

            String sg1[] = new String(data).split("\n");
            String col[] = sg1[0].split(",");   // first row: column names
            String colty[] = sg1[1].split(","); // second row: column types ("dis" = discrete)

            String colName[] = new String[col.length];
            String colType[] = new String[col.length];
            for (int i = 0; i < col.length; i++) {
                colName[i] = col[i];
                colType[i] = colty[i];
            }

            String dSet[][] = new String[sg1.length - 2][col.length];
            int nData[][] = new int[sg1.length - 2][col.length];

            // Load the data rows and collect the distinct class labels
            for (int i = 2; i < sg1.length; i++) {
                String sg2[] = sg1[i].split(",");
                for (int j = 0; j < sg2.length; j++) {
                    dSet[i - 2][j] = sg2[j];
                }
                String c1 = sg2[sg2.length - 1].trim();
                if (!cls.contains(c1)) {
                    cls.add(c1);
                }
            }

            System.out.println("cls " + cls);
            System.out.println("dset = " + dSet.length + " : " + dSet[0].length);

            // Encode each column: discrete values map to their index, numeric values are rounded
            for (int i = 0; i < colType.length; i++) {
                if (colType[i].trim().equals("dis")) {
                    ArrayList<String> at = new ArrayList<>();
                    for (int j = 0; j < dSet.length; j++) {
                        String g1 = dSet[j][i].trim();
                        if (!at.contains(g1)) {
                            at.add(g1);
                        }
                    }
                    for (int j = 0; j < dSet.length; j++) {
                        nData[j][i] = at.indexOf(dSet[j][i].trim());
                    }
                } else {
                    for (int j = 0; j < dSet.length; j++) {
                        nData[j][i] = (int) Math.round(Double.parseDouble(dSet[j][i]));
                    }
                }
            }

            // Write each row as: DT-predicted label <tab> feature1 <tab> feature2 ...
            StringBuilder txt1 = new StringBuilder();
            for (int i = 0; i < nData.length; i++) {
                String g1 = newCls.get(i).toString(); // label comes from the decision tree
                for (int j = 0; j < nData[0].length - 1; j++) {
                    g1 = g1 + "\t" + nData[i][j];
                }
                txt1.append(g1.trim()).append("\n");
            }
            System.out.println(txt1);

            FileOutputStream fos = new FileOutputStream(new File("newtest1.txt"));
            fos.write(txt1.toString().getBytes());
            fos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
10.2 SCREEN SHOTS

CHAPTER 11

REFERENCES

[1] R. Kumar, A. Malik, and V. Ranga, ‘‘An intellectual intrusion detection
system using hybrid hunger games search and remora optimization algorithm
for IoT wireless networks,’’ Knowl.-Based Syst., vol. 256, Nov. 2022, Art. no.
109762.

[2] W. Wang, S. Jian, Y. Tan, Q. Wu, and C. Huang, ‘‘Representation
learning-based network intrusion detection system by capturing explicit and
implicit feature interactions,’’ Comput. Secur., vol. 112, Jan. 2022, Art. no.
102537.

[3] J. Oughton, W. Lehr, K. Katsaros, I. Selinis, D. Bubley, and J. Kusuma,
‘‘Revisiting wireless internet connectivity: 5G vs Wi-Fi 6,’’ Telecomm. Policy,
vol. 45, no. 5, Jun. 2021, Art. no. 102127.

[4] B. A. Tama and S. Lim, ‘‘Ensemble learning for intrusion detection systems:
A systematic mapping study and cross-benchmark evaluation,’’ Comput. Sci.
Rev., vol. 39, Feb. 2021, Art. no. 100357.

[5] S. Lei, C. Xia, Z. Li, X. Li, and T. Wang, ‘‘HNN: A novel model to study
the intrusion detection based on multi-feature correlation and temporal-spatial
analysis,’’ IEEE Trans. Netw. Sci. Eng., vol. 8, no. 4, pp. 3257–3274, Oct. 2021.

[6] Y. Cheng, Y. Xu, H. Zhong, and Y. Liu, ‘‘Leveraging semisupervised
hierarchical stacking temporal convolutional network for anomaly detection in
IoT communication,’’ IEEE Internet Things J., vol. 8, no. 1, pp. 144–155, Jan.
2021.

[7] X. Li, M. Zhu, L. T. Yang, M. Xu, Z. Ma, C. Zhong, H. Li, and Y. Xiang,
‘‘Sustainable ensemble learning driving intrusion detection model,’’ IEEE
Trans. Dependable Secure Comput., vol. 18, no. 4, pp. 1591–1604, Jul./Aug.
2021.

[8] Y. Zhou, G. Cheng, S. Jiang, and M. Dai, ‘‘Building an efficient intrusion
detection system based on feature selection and ensemble classifier,’’ Comput.
Netw., vol. 174, Jun. 2020, Art. no. 107247.

[9] G. Kumar, K. Thakur, and M. R. Ayyagari, ‘‘MLEsIDSs: Machine learning-
based ensembles for intrusion detection systems—A review,’’ J. Supercomput.,
vol. 76, no. 11, pp. 8938–8971, Nov. 2020.

[10] B. A. Tama, L. Nkenyereye, S. M. R. Islam, and K. Kwak, ‘‘An enhanced
anomaly detection in web traffic using a stack of classifier ensemble,’’ IEEE
Access, vol. 8, pp. 24120–24134, 2020.

[11] S. Hajiheidari, K. Wakil, M. Badri, and N. J. Navimipour, ‘‘Intrusion
detection systems in the Internet of Things: A comprehensive investigation,’’
Comput. Netw., vol. 160, pp. 165–191, Sep. 2019.

[12] M. Akbanov, V. G. Vassilakis, and M. D. Logothetis, ‘‘Ransomware
detection and mitigation using software-defined networking: The case of
WannaCry,’’ Comput. Electr. Eng., vol. 76, pp. 111–121, Jun. 2019.

[13] J. W. Mikhail, J. M. Fossaceca, and R. Iammartino, ‘‘A semi-boosted
nested model with sensitivity-based weighted binarization for multi-domain
network intrusion detection,’’ ACM Trans. Intell. Syst. Technol., vol. 10, no. 3,
pp. 1–27, May 2019.

[14] K. Li, G. Zhou, J. Zhai, F. Li, and M. Shao, ‘‘Improved PSO AdaBoost
ensemble algorithm for imbalanced data,’’ Sensors, vol. 19, no. 6, p. 1476, Mar.
2019.
