
A Project Report

On

Detection of DoS attack in IoT Devices


by

SRI NITHYA BANDI (2022AAPS2019H)


SREYASH SOMESH MISHRA (2022A8PS0676H)
PRANJAL BHARDWAJ (2022A3PS1858H)
GURUPRIYA D (2022A3PS0560H)
SAI VARUN RAGI (2022B2A31762H)
BHAVYA REDDY SAMA (2022AAPS2026H)

Group 11

Under the supervision of

Prof. Ravikiran Yeleswarapu

Submitted in partial fulfilment of the requirements of


EEE F411: INTERNET OF THINGS

BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI


HYDERABAD CAMPUS
(OCT-DEC 2024)
Table of Contents
1. Problem Statement
2. What is DoS attack
3. Types of DoS Attacks
4. DoS vs DDoS Attacks
5. Impact of DoS attacks
6. Experiencing a DoS Attack
7. Data Collection of DoS Attacks
8. SNORT IDS (Intrusion Detection System)
9. Detection Techniques on Edge vs. Cloud
10. Detection on Edge vs. Cloud
11. Challenges of Detection on Edge vs. Cloud
12. Dataset information
13. Data pre-processing for NSL-KDD dataset
14. Feature Analysis and Selection
15. Edge model
16. Model Results
17. Suggested methods to improve accuracy and Their Challenges
18. Challenges Faced in Implementing the Model in Real-Time
19. Performance Comparison for Edge Deployment
20. Cloud-based Random Forest model for DoS Attack detection
21. Cloud-based ANN model for DoS Attack detection
22. Justification for employing Random Forest model for cloud-based DoS attack detection
23. Comparison with some other models for Cloud deployment
24. Enhanced Communication Protocols for Edge-to-Cloud Synchronization
25. Introduction to the Small-Big Model Framework
26. Communication Workflow
27. Testing and Evaluation of the Datasets
28. Benefits of the Framework
29. XAI and Blockchain-Powered Edge-to-Cloud System for DoS Mitigation in IoT
30. XAI for Transparent and Trustworthy DoS Detection
31. Blockchain for Secure Data Management and Collaboration
32. Edge-to-Cloud Architecture for Efficient and Scalable DoS Mitigation
33. Implementation Plan for XAI and Blockchain Integration in Edge-to-Cloud DoS Detection
34. References
35. Contribution of members
Denial of Service (DoS) Attack
A denial-of-service (DoS) attack is a malicious attempt to disrupt or shut down the normal
functioning of a targeted server, service, or network by overwhelming it with a flood of
illegitimate requests that trigger a crash. This causes the target to become slow, unresponsive,
or utterly inaccessible to legitimate users. These malicious endeavors can cripple websites,
disrupt services, and cause significant financial and reputational damage.

Historical Context and Notable Incidents


In the early 2000s, the first major DoS attack targeted Yahoo!, a leading internet portal,
rendering its services inaccessible for nearly an hour. This incident highlighted the vulnerability
of even the most robust systems.

In 2016, the Mirai botnet DDoS attack exploited IoT devices, crippling major websites like
Twitter and Netflix by flooding DNS provider Dyn with traffic. This attack underscored the
growing threat posed by the proliferation of connected devices.

Another notable incident occurred in 2018 when GitHub faced a record-breaking 1.35 Tbps
attack, leveraging Memcached servers to amplify traffic. These historical events illustrate the
evolving tactics and increasing scale of DDoS attacks.

Each incident prompted advancements in defensive measures, from improved traffic filtering
to deploying more sophisticated intrusion detection systems. Understanding these pivotal
moments provides crucial insights into DoS threats' persistent and adaptive nature,
emphasizing the need for continuous innovation in cybersecurity defenses.

DoS vs DDoS Attacks


DoS attacks involve overwhelming a target with traffic from a single source, while distributed
denial of service (DDoS) attacks involve multiple compromised systems flooding the target
simultaneously.

The distribution of hosts that defines a DDoS provides the attacker multiple advantages:

 They can leverage the greater volume of machines to execute a more disruptive attack

 The location of the attack is difficult to detect due to the random distribution of
attacking systems (often worldwide and from otherwise legitimate systems)

 It is more difficult to shut down multiple machines than one

 The true attacking party is challenging to identify, as they are disguised behind many
(mostly compromised) systems

DDoS attacks are challenging to mitigate because blocking one source does not stop the attack.
They require more sophisticated solutions, such as traffic analysis, rate limiting, and using
content delivery networks (CDNs) to distribute and absorb the traffic load.

Sri Nithya Bandi


Types of Denial of Service Attacks
Denial of service (DoS) attacks manifest in various forms, each designed to exploit specific
vulnerabilities within a system. Understanding these attack vectors is vital for developing
resilient cybersecurity strategies.
Buffer Overflow Attacks

The most common denial of service (DoS) attack is the buffer overflow attack, which involves
sending more traffic to a network address than the system is designed to handle. This can
manifest in various forms, including:

 ICMP flood: This attack targets misconfigured network devices by sending spoofed
packets that ping every computer on the targeted network, causing the network to
amplify the traffic. Related ICMP-based attacks include the Smurf attack and the ping of death.

 SYN flood: In this attack, a request to connect to a server is sent, but the handshake is
never completed. This continues until all open ports are saturated with requests, making
none available for legitimate users to connect to.

Malicious actors exploit buffer overflow vulnerabilities by overloading a buffer with data,
leading to system crashes and unpredictable behavior. Attackers may also inject malicious code
to gain unauthorized access and compromise sensitive information.
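The half-open-connection behaviour that defines a SYN flood can be sketched in a few lines of Python. This is an illustrative toy, not part of the project's implementation: it tracks sources whose SYNs were never followed by an ACK and flags a flood when too many accumulate; the threshold is an arbitrary assumption.

```python
# Toy SYN-flood heuristic (illustrative only): count sources whose SYN was
# never followed by an ACK, i.e. half-open connections.
def syn_flood_suspected(packets, max_half_open=100):
    """packets: list of (src_ip, tcp_flags) tuples, e.g. ('10.0.0.1', 'SYN')."""
    half_open = set()
    for src, flags in packets:
        if flags == 'SYN':       # new connection request
            half_open.add(src)
        elif flags == 'ACK':     # handshake completed by this source
            half_open.discard(src)
    return len(half_open) > max_half_open

# Simulated flood: 500 spoofed sources send SYNs that are never acknowledged.
flood = [(f"host{i}", "SYN") for i in range(500)]
print(syn_flood_suspected(flood))                                  # True
print(syn_flood_suspected([("client", "SYN"), ("client", "ACK")])) # False
```

A real detector would also expire old entries over time; this sketch only illustrates the half-open bookkeeping.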

Flood Attacks

Attackers overwhelm a network with excessive traffic, disrupting legitimate requests. This
often involves botnets and strains the target's resources, as seen in the 2016 Dyn attack.
Mitigation strategies include rate limiting, traffic analysis, firewalls, content delivery networks,
redundancy, proactive monitoring, and anomaly detection.

Application Layer Attacks


Attackers exploit vulnerabilities in web applications, targeting features like login pages, search
functions, or database queries. These attacks can overwhelm application resources, leading to
slowdowns or crashes. Techniques include HTTP floods and Slowloris attacks.

Protocol Attacks

Attackers exploit weaknesses in network protocols to disrupt services, often targeting TCP/IP
layers:

 SYN flood attacks overwhelm servers and exhaust resources by sending numerous
connection requests without completing the handshake.

 DNS amplification attacks leverage vulnerable DNS servers to amplify traffic, directing
it to the target.

 Smurf attacks misuse ICMP by sending spoofed packets to a network's broadcast
address, causing all devices to flood the victim with responses.



Volumetric Attacks

Attackers inundate networks with massive volumes of traffic, overwhelming bandwidth and
server capacity. Botnets, comprising thousands of compromised devices, generate this flood,
challenging detection and mitigation.

Common tactics include UDP floods, which exploit the connectionless nature of the protocol,
and ICMP floods, which bombard the target with echo requests. These attacks can peak at
terabits per second, crippling even robust infrastructures.
Effective defenses involve deploying robust traffic filtering, leveraging content delivery
networks (CDNs) to absorb excess traffic, and utilizing scrubbing centers to cleanse incoming
data. Constant monitoring and adaptive rate limiting can enhance resilience against these high-
volume onslaughts.

Cloud-Based Attacks
DoS attacks on cloud resources often target the hypervisor layer or involve crypto-jacking.

Hypervisor DoS Attacks:

 How: These attacks exploit vulnerabilities in the hypervisor layer, which manages and
allocates resources to virtual machines (VMs).

 Impact: If successful, the hypervisor can crash, rendering all VMs on that host
inaccessible.

 Result: The entire cloud infrastructure becomes unavailable, affecting services and
users.

Hypercall Attacks:

 How: Attackers send specially crafted requests to the cloud hypervisor, aiming to
extract information or execute malicious code.

 Impact: If the hypervisor processes these malicious hypercalls, it can lead to resource
exhaustion or system instability.

 Result: VMs may become unresponsive, causing service disruptions.

Hyperjacking:

 How: An attacker installs a rogue hypervisor beneath the original one. The rogue
hypervisor remains undetected, allowing the attacker to gain control of the target
hypervisor and its resources.

 Impact: With control of the hypervisor, the attacker can manipulate the VM's behavior,
consume resources, or launch further attacks.

 Result: Service degradation or complete unavailability, depending on the compromised
VM's role.



Crypto-jacking:

 How: An attacker compromises cloud resources and installs crypto-mining software to
mine cryptocurrency.

 Impact: Crypto-jacking depletes available resources, such as CPU, RAM, and network
bandwidth, making a VM unresponsive.

 Result: Overloaded systems become unresponsive, leading to service degradation or
complete unavailability.

Mechanisms and Tools Used in DoS Attacks


Denial of service (DoS) attacks utilize various mechanisms and tools that can significantly
disrupt services. Still, they can also be mitigated with appropriate security measures, such as
firewalls, intrusion detection systems, rate limiting, and anti-DDoS services. These
mechanisms and tools, when combined, create formidable challenges for cybersecurity
defenses, necessitating advanced detection and mitigation strategies to protect against the
relentless onslaught of DoS attacks.

Botnets and Malware

Cybercriminals use botnets, networks of compromised devices, for large-scale DDoS attacks.
Infected devices bombard targets with overwhelming traffic without their owners knowing.
Malware infiltrates devices through phishing emails, malicious downloads, or unpatched
software. Compromised devices become part of a botnet, controlled remotely by the attacker.
Mirai, a notorious botnet, has taken down major websites with massive traffic floods.

Attack Tools and Scripts

Hackers employ a variety of sophisticated tools and scripts to launch DoS attacks. LOIC (Low
Orbit Ion Cannon) and HOIC (High Orbit Ion Cannon) are popular open-source tools that
enable users to flood targets with HTTP, TCP, or UDP requests. Script kiddies often use these
tools due to their ease of use.

Advanced attackers might deploy custom Python or Perl scripts to exploit specific
vulnerabilities. These scripts can automate the process, launching highly targeted attacks that
bypass traditional defenses. Tools like Metasploit also provide modules for DoS attacks,
allowing attackers to integrate them into broader exploitation frameworks.

Amplification Techniques

Attackers exploit amplification techniques to magnify the volume of traffic directed at a target,
overwhelming its resources. By leveraging protocols like DNS, NTP, and SSDP, they send
small requests with spoofed IP addresses, causing servers to respond with significantly larger
replies to the victim.

This method, known as reflection, can exponentially increase the attack's impact. For example,
a 1-byte request can generate a 100-byte response, creating a 100:1 amplification ratio.
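The amplification ratio translates directly into attack bandwidth. A quick back-of-the-envelope calculation using the 1-byte/100-byte figures above (the attacker's bandwidth is an assumed example value):

```python
# Illustrative reflection-amplification arithmetic (example figures).
request_bytes = 1            # small spoofed request sent by the attacker
response_bytes = 100         # larger reply the reflector sends to the victim
amplification = response_bytes / request_bytes   # 100:1 ratio

attacker_rate_mbps = 10      # bandwidth the attacker actually spends (assumed)
victim_rate_mbps = attacker_rate_mbps * amplification
print(f"{amplification:.0f}:1 amplification -> {victim_rate_mbps:.0f} Mbps at the victim")
# 100:1 amplification -> 1000 Mbps at the victim
```

In other words, a modest 10 Mbps of spoofed requests becomes a gigabit of traffic at the target, which is why reflectors are so attractive to attackers.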



Attackers often combine multiple amplification vectors, making it challenging for defenders to
mitigate the flood of malicious traffic effectively.

Detection and Identification of DoS Attacks


Early detection and response to a denial of service (DoS) attack by your security operations
center (SOC) is critical to business operations. Attackers may attempt to perform a DoS attack
via network exhaustion, abuse of cloud resources, or blocking the availability of targeted
resources to users and services—all of which can and should be detected by your SOC via best-
in-class tools and processes.

Common Indicators of DoS Attacks

Sudden spikes in traffic often signal a DoS attack, overwhelming network resources and
causing service disruptions. Unusual patterns, such as repeated requests from a single IP
address or a surge in incomplete connections, also indicate malicious activity. Degraded system
performance, including slow response times and frequent crashes, further highlights potential
threats.

Monitoring tools that analyze traffic in real time can identify these anomalies and provide
critical insights. Machine learning algorithms enhance detection by recognizing deviations
from normal behavior, enabling quicker responses. Accurate identification of these indicators
is vital for mitigating the impact of DoS attacks and maintaining system integrity.

Traffic Analysis and Monitoring

Real-time traffic analysis helps detect DoS attacks by monitoring data packets for irregularities.
Advanced systems use machine learning to differentiate between legitimate traffic and
potential threats, with automated alerts for immediate response. Effective traffic analysis
detects ongoing attacks and provides valuable data for strengthening defenses against future
threats.
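As a minimal illustration of such anomaly detection (not the project's actual model), a z-score over recent packets-per-second samples can separate a sudden spike from normal variation; the threshold of 3 standard deviations is a common but assumed choice:

```python
# Sketch of traffic-rate anomaly detection: compare the current
# packets-per-second sample against a recent baseline using a z-score.
from statistics import mean, stdev

def is_anomalous(history, current, threshold=3.0):
    """history: recent packets-per-second samples; current: newest sample."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return (current - mu) / sigma > threshold

baseline = [100, 110, 95, 105, 98, 102, 97, 108]  # normal traffic (pps)
print(is_anomalous(baseline, 104))    # within normal variation
print(is_anomalous(baseline, 5000))   # sudden spike -> likely attack
```

Production systems use far richer features (flow counts, entropy of source addresses, protocol mix), but the core idea of flagging deviations from a learned baseline is the same.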

Differentiating Between Legitimate and Malicious Traffic

Machine learning algorithms analyze behavioral patterns to distinguish between normal and
malicious user activity. Legitimate traffic displays consistent, predictable patterns, while
malicious traffic often shows erratic spikes and unusual request types.

Deep packet inspection (DPI) scrutinizes data at a granular level to identify anomalies that
signal potential threats. Whitelisting known IP addresses and employing rate limiting further
refine traffic differentiation.

Prevention and Mitigation Strategies


Effective prevention and mitigation strategies must be in place to defend against DoS attacks
and strengthen and protect systems from the constantly evolving threat landscape of DoS
attacks.



Here are some key defense measures for network and application layers:

 Use deep packet inspection (DPI) to analyze data packets for malicious signatures and
anomalies.

 Implement web application firewalls (WAFs) to filter and monitor HTTP traffic,
blocking harmful requests before they reach the server.

 Utilize intrusion detection systems (IDS) and intrusion prevention systems (IPS) to
detect and prevent suspicious activities in real time.

 Employ Secure Sockets Layer (SSL) encryption to protect data integrity and
confidentiality, making it harder for attackers to intercept and manipulate traffic.

 Integrate machine learning algorithms and artificial intelligence (AI) to identify and
adapt to new attack patterns, enhancing the strength of your defenses against
sophisticated threats.
Rate Limiting and Traffic Filtering

Set rate limits to throttle incoming requests to prevent overwhelming your servers. This
approach helps manage a user's requests within a specific timeframe, effectively mitigating
potential denial of service (DoS) attacks.
Implement traffic filtering to distinguish between legitimate and malicious traffic, using criteria
such as IP reputation and request patterns. By employing these measures, you can ensure
genuine users maintain access while blocking harmful traffic. Real-time monitoring tools can
adjust rate limits and filtering rules dynamically, providing an adaptive defense mechanism
against evolving threats.
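The token-bucket algorithm is one common way to implement the rate limiting described above. A minimal sketch (parameter values are illustrative): each client gets a bucket that refills at a steady rate and allows short bursts up to its capacity.

```python
# Token-bucket rate limiter sketch: bursts up to `capacity` are allowed,
# sustained traffic is throttled to `rate` requests per second.
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False        # request dropped or throttled

bucket = TokenBucket(rate=5, capacity=10)      # ~5 requests/s per client
results = [bucket.allow() for _ in range(20)]  # a sudden 20-request burst
print(results.count(True))                     # roughly the first 10 pass
```

In practice one bucket is kept per source IP (or per API key), so a flooding source exhausts only its own allowance while genuine users are unaffected.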

Use of Anycast Networks

Deploy anycast networks to distribute traffic across multiple servers, reducing the risk of a
single point of failure. By routing requests to the nearest or least congested server, anycast
enhances load balancing and minimizes latency. This strategy improves user experience and
mitigates the impact of DoS attacks by dispersing malicious traffic.
Incident Response and Recovery Plans

Organizations must establish vigorous incident response and recovery plans to counteract and
recover from DoS attacks swiftly. Rapid identification of attack vectors and immediate
isolation of affected systems are crucial.

Employ automated real-time monitoring and alerting tools to ensure swift detection and
response. Develop a comprehensive recovery strategy that includes data backups, system
redundancies, and predefined communication protocols. Regularly update and test these plans
to adapt to evolving threats.



Organizations can minimize downtime, protect critical assets, and ensure business continuity
despite persistent and sophisticated DoS attacks by maintaining a well-prepared incident
response framework.

DoS Attacks in Cloud systems


Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks remain significant
threats in cloud environments due to their centralized nature and resource-sharing
characteristics. These attacks aim to disrupt cloud services, overload resources, and deny
legitimate users access. Robust detection mechanisms are crucial to safeguard cloud devices
and services from such threats while maintaining availability and scalability.

Use Cases of Cloud-Based DoS Detection

1. E-commerce Platforms: Protecting online stores from volumetric DDoS attacks
during peak shopping seasons.

2. Streaming Services: Ensuring uninterrupted service by mitigating attacks targeting
bandwidth and server resources.

3. Financial Services: Preventing downtime for online banking systems and transaction
gateways.

4. Healthcare Systems: Safeguarding cloud-hosted electronic medical records and
telemedicine services.

Edge Device DoS Attack Detection


Edge devices, such as IoT devices, smart appliances, and industrial controllers, are essential
components of modern distributed networks. However, their limited resources and exposure to
public networks make them prime targets for Denial of Service (DoS) attacks. DoS attacks can
overwhelm edge devices, causing performance degradation, system crashes, and denial of
legitimate services. Detecting these attacks efficiently in a resource-constrained environment
is a key challenge.

Use Cases

1. Smart Homes:
o Protecting devices like smart speakers, thermostats, and security cameras from
malicious traffic.
2. Industrial IoT (IIoT):
o Safeguarding industrial controllers and sensors from attacks that can disrupt
manufacturing processes.
3. Smart Cities:
o Preventing attacks on critical systems such as traffic lights, surveillance
cameras, and public utilities.
4. Healthcare IoT:
o Ensuring the availability of medical devices, such as patient monitors and
infusion pumps.

Comparison of DoS Detection on Edge Devices vs. Cloud Systems


Processing Location
  Edge: Detection occurs directly on the edge device (e.g., IoT hubs, gateways).
  Cloud: Detection occurs in a centralized or distributed cloud infrastructure.

Latency
  Edge: Low latency; real-time detection and response.
  Cloud: Higher latency due to communication and centralized processing delays.

Scalability
  Edge: Limited by device resources; more devices require distributed approaches.
  Cloud: Highly scalable, leveraging dynamic cloud resources.

Resource Availability
  Edge: Limited CPU, memory, and power constraints.
  Cloud: High resource availability for complex computations and storage.

Detection Techniques
  Edge: Lightweight methods like anomaly detection, rule-based systems, or compact ML models.
  Cloud: Advanced techniques such as deep learning, big data analytics, and collaborative filtering.

Response Time
  Edge: Faster, as actions (e.g., traffic filtering) are local.
  Cloud: Slower response due to communication overhead with the edge.

Traffic Visibility
  Edge: Limited to local traffic; less global context.
  Cloud: Global traffic view allows for detecting large-scale patterns (e.g., botnets).

Collaboration
  Edge: Relies on peer-to-peer sharing or coordination with cloud systems.
  Cloud: Centralized data sharing across multiple nodes for collaborative detection.

Energy Efficiency
  Edge: Energy-efficient detection mechanisms are required.
  Cloud: Energy consumption is less of a concern due to ample resources.

Complexity of Detection
  Edge: Simpler algorithms tailored for resource constraints.
  Cloud: Supports complex algorithms due to higher computational power.

Cost
  Edge: Cost-effective, as no extensive infrastructure is required.
  Cloud: Higher operational cost due to cloud infrastructure usage.

False Positives/Negatives
  Edge: Higher false-positive rates due to limited computational analysis.
  Cloud: Lower error rates with advanced models and broader context.

DoS Detection Techniques on Edge Devices vs. Cloud Systems


Anomaly-Based Detection
  Edge: Monitors traffic for deviations from normal patterns.
  Cloud: Uses statistical models to analyze and detect traffic anomalies globally.

Signature-Based Detection
  Edge: Compares traffic to known attack signatures.
  Cloud: Performs signature matching at a global level for sophisticated attacks.

Flow-Based Monitoring
  Edge: Analyzes network traffic flows to detect volumetric attacks.
  Cloud: Processes aggregated traffic flows from across the entire network.

Machine Learning (ML)
  Edge: Lightweight models such as decision trees, SVM, or small CNNs.
  Cloud: Advanced models like deep neural networks (DNNs), LSTMs, and other big data algorithms.

Rule-Based Filtering
  Edge: Applies predefined rules for traffic filtering, such as blocking IPs.
  Cloud: Uses more complex rule sets based on global threat intelligence.

Collaborative Detection
  Edge: Peer-to-peer detection within a local network or edge nodes.
  Cloud: Centralized detection utilizing multiple devices' data for broader insights.

Traffic Correlation
  Edge: Limited to local devices or within the same network.
  Cloud: Correlates data across global regions to detect distributed DoS attacks.

Deep Learning
  Edge: Rare due to resource constraints; simple models only.
  Cloud: Utilizes deep learning for complex attack detection and classification.

Statistical Analysis
  Edge: Basic methods like traffic rate analysis for anomaly detection.
  Cloud: Uses large-scale statistical models to detect both known and unknown attack patterns.

Real-Time Traffic Monitoring
  Edge: Immediate detection and response to abnormal behavior.
  Cloud: Delayed due to processing time and network communication, though scalable.

Challenges in DoS Detection on Edge Devices vs. Cloud Systems


Resource Constraints
  Edge: Limited CPU, memory, and power.
  Cloud: High computational resources available.

Latency
  Edge: Low latency needed but challenging due to resources.
  Cloud: Higher latency due to network transmission delays.

Scalability
  Edge: Hard to scale due to device limitations.
  Cloud: Easily scalable with cloud resources.

Accuracy
  Edge: Higher false positives/negatives with simple models.
  Cloud: Higher accuracy using advanced algorithms.

Adaptability to New Attacks
  Edge: Limited adaptability without cloud support.
  Cloud: Can quickly adapt with cloud-based updates.

Network Bandwidth
  Edge: Limited bandwidth for sending data.
  Cloud: Uses high bandwidth but may experience congestion.

Data Aggregation
  Edge: Hard to aggregate across devices.
  Cloud: Easier data aggregation for a global view.

Complexity of Models
  Edge: Limited to simpler detection methods.
  Cloud: Can support complex detection models (e.g., deep learning).

Energy Efficiency
  Edge: Energy-efficient models necessary.
  Cloud: Less concern for energy consumption.

Security Risks
  Edge: Vulnerable to physical tampering.
  Cloud: Centralized and more secure, but cloud-targeted attacks are a risk.



Introduction

DoS (Denial of Service) and DDoS (Distributed Denial of Service) attacks are cyberattacks aimed
at disrupting the normal functioning of a system, server, or network by overwhelming it with
excessive traffic or requests. While a DoS attack originates from a single source, a DDoS attack
leverages multiple compromised devices, often part of a botnet, to launch a coordinated assault.
These attacks exploit exposed services and open ports to crash servers, degrade performance, or
make services unavailable to legitimate users. Common techniques include TCP SYN floods, UDP
floods, the Ping of Death, and HTTP request floods, targeting websites, APIs, DNS servers, or other
internet-exposed services. DDoS attacks are particularly challenging to mitigate due to their
distributed nature, which makes it hard to distinguish between malicious and legitimate traffic.
Motivations range from financial gain and hacktivism to cyberwarfare and personal vendettas.
Effective defense strategies include firewalls, traffic filtering, rate limiting, CDNs, and
specialized DDoS protection services, alongside proactive measures such as traffic monitoring and
incident response plans. This report also discusses how the Snort intrusion detection system
(IDS) works.

By Sai Varun Ragi , 2022B2A31762H


What is a DoS attack and how does it work?

A DoS (denial of service) attack is one in which a single attacker, using either their own IP
address or a spoofed one, floods a target with packets at very high rates, on the order of
100,000 packets per second. The first figure below shows what normal traffic looks like.

In a TCP SYN-flood DoS attack, the attacker sends connection requests from a spoofed IP address
that may not even exist. The target server replies with a SYN/ACK, but the spoofed address never
completes the handshake, so the connection stays half-open. Because the attacker keeps sending
new SYN packets, the server's connection table fills with half-open connections. When a
legitimate user then requests the service, nothing is displayed: the page keeps buffering, and
opening it in a new tab does not help.

DDoS attacks work similarly, except that instead of one computer, many compromised machines send
packets at the same time, making the load far harder for the server to handle; the server may
even crash. In some cases attackers also use the opportunity to steal data or mount phishing
attacks. With that background, we now move to the attack itself.



DOS attacks and DDOS attacks

There are various types of DoS and DDoS attacks. We perform a TCP SYN flood, a UDP flood, and an
ICMP flood. In a TCP SYN flood, the three-way handshake is never completed. In a UDP flood, the
attacker sends a large volume of UDP packets to random ports on the target system, forcing the
system to process and respond to each packet even when no application is listening on that port,
which consumes system resources. Normally, ICMP echo-request and echo-reply messages are used to
ping a network device to diagnose its health and connectivity; in an ICMP flood, the target is
bombarded with request packets and forced to respond with an equal number of reply packets.

We need an attacking device and a target device: here, the attacks are launched from a Kali Linux
VM and the target is Metasploitable 2. To verify that each flood is actually occurring, we
capture and measure the traffic with Wireshark.

-----------------------------------------------------------------------------------------------------------------

Metasploitable 2
Why Metasploitable 2? Metasploitable 2 is an intentionally vulnerable Linux virtual machine
designed for security training, exploit testing, and general target practice, which makes it a
safe, legal environment for experiments like those in this report. Unlike other vulnerable
virtual machines, Metasploitable 2 focuses on vulnerabilities at the operating-system and
network-services layer rather than on custom vulnerable applications. From the VM's homepage we
find that the IP address of the vulnerable test website is 192.168.56.101; this is our target
device.



Kali Linux

Kali Linux (formerly known as BackTrack Linux) is an open-source, Debian-based Linux
distribution that allows users to perform advanced penetration testing and security auditing. It
runs on multiple platforms and is freely available to both information security professionals
and hobbyists.

The distribution ships with several hundred tools, configurations, and scripts with
industry-specific modifications that let users focus on tasks such as computer forensics,
reverse engineering, and vulnerability detection instead of unrelated activities.

It is tailored to the needs of experienced penetration testers, so its documentation assumes
prior knowledge of, and familiarity with, the Linux operating system.

-----------------------------------------------------------------------------------------------------------------

DVWA
DVWA, the Damn Vulnerable Web App, is the site we attack. Damn Vulnerable Web Application (DVWA)
is a deliberately vulnerable PHP/MySQL web application. Its main goal is to help security
professionals test their skills and tools in a legal environment and to help web developers
better understand the process of securing web applications. DVWA is one of the hyperlinks on the
homepage of our target IP address, 192.168.56.101. Shown below are the homepage of the IP
address, the DVWA link, and the DVWA start page.



Wireshark

Wireshark is the world's foremost network protocol analyzer. It lets you see what's happening on
your network at a microscopic level. It is the de facto (and often de jure) standard across many
industries and educational institutions.

We used Wireshark to count the packets and analyse whether the DoS attacks were actually
occurring.

-----------------------------------------------------------------------------------------------------------------

Procedure

● Open your Kali Linux terminal and run the following:

→ sudo su
# Switch to the root shell
→ hping3 -S 192.168.56.101 -d 50 -p 80 --flood --rand-source
# Initiate the TCP SYN flood; stop it with Ctrl+C
→ hping3 -2 192.168.56.101 -d 50 -p 80 --flood --rand-source
# Initiate the UDP flood (-2 selects UDP mode); stop it with Ctrl+C
→ hping3 -1 192.168.56.101 -d 50 --flood --rand-source
# Initiate the ICMP flood (-1 selects ICMP mode; no port is needed); stop it with Ctrl+C

● While each command is running, keep Wireshark capturing and check whether the
attack traffic is visible.
● Export the capture and analyze the data in Wireshark.
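Once the capture is exported from Wireshark (File > Export Packet Dissections > As CSV), a few lines of Python can confirm where the flood is directed. The column names below are assumptions about Wireshark's CSV export format, and the addresses are sample data, not the actual capture:

```python
# Sketch: count packets per destination in a Wireshark CSV export to spot
# the flood target. Sample data stands in for the real export.
import csv, io
from collections import Counter

sample_csv = """No.,Source,Destination,Protocol
1,203.0.113.5,192.168.56.101,TCP
2,198.51.100.7,192.168.56.101,TCP
3,192.168.56.101,203.0.113.5,TCP
4,198.51.100.9,192.168.56.101,TCP
"""

dests = Counter(row["Destination"] for row in csv.DictReader(io.StringIO(sample_csv)))
top, count = dests.most_common(1)[0]
print(f"{top} received {count} of {sum(dests.values())} packets")
# 192.168.56.101 received 3 of 4 packets
```

During a real flood, one destination dominating the counter by orders of magnitude (from many random sources) is exactly the pattern seen in the captures below.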



Data

The first capture shows normal traffic, without any DoS attack. During the DoS attack we are
unable to reach the site, because the capture fills with half-open TCP handshakes.



TCP Flooding

During TCP flooding, 54,071 TCP packets were transmitted from random sources without any
three-way handshakes, each carrying a payload of about 50 bytes. Shown below are the structure of
the TCP segment, the captured packet data, and an explanation of each field. Note that all the
packets are directed to a single IP address, 192.168.56.101; hence this is TCP flooding.

The captures also show the protocol layering of Ethernet, IPv4, and TCP, and confirm that the
payload of each packet is 50 bytes.



UDP Flooding

During UDP flooding, 69,543 UDP packets were transmitted from random sources, each carrying a payload of about 50 bytes. Attached below are the structure of the UDP datagram, the captured packet data, and an explanation of each field.



ICMP Flooding

During ICMP flooding, 48,314 ICMP packets were transmitted from random sources, each carrying a payload of about 50 bytes. Attached below are the structure of the ICMP packet, the captured packet data, and an explanation of each field.



Comparison

Attack Type        | Harmfulness              | Reasons
UDP Flooding       | Potentially more harmful | High amplification potential; harder to detect; can saturate bandwidth easily.
TCP Flooding       | Moderate to high         | Exploits resource-intensive connection handling; application-layer floods like HTTP can mimic legitimate traffic.
IPv4-Based Attacks | Varies (high potential)  | Includes a variety of attack types (ICMP floods, Smurf attacks, etc.); amplification and spoofing make IPv4 attacks versatile and potentially severe.



SNORT IDS

Snort is an open-source Intrusion Detection and Prevention System (IDPS) developed by Cisco. It
monitors network traffic in real-time to detect and respond to malicious activities or security policy
violations. Snort is widely used for network security due to its flexibility and extensive rule-based
detection capabilities.

Key Features of Snort:

1. Packet Sniffing: Captures and analyzes network packets in real time.


2. Intrusion Detection:

→Uses predefined or custom rules to identify suspicious traffic (e.g., port scans, buffer
overflows).

→Alerts administrators when a potential threat is detected.

3. Intrusion Prevention:

→Can block or mitigate malicious traffic dynamically when configured as an inline


system.

4. Logging and Analysis:

→Logs packets for further analysis.

→Integrates with tools like Wireshark for detailed packet inspection.

5. Extensibility: Supports custom rule creation to adapt to emerging threats.

Snort primarily operates as a signature-based and rule-based detection engine, meaning it relies on
predefined patterns (signatures) or rules to identify threats.

● Signature-Based Detection:

→Matches packets against a database of known attack patterns.

→Effective for detecting well-documented threats but may struggle with new, unknown
attacks.

● Rule-Based Detection:

→Allows users to define custom rules for specific traffic behaviors.

→Flexible but requires expertise to configure effectively.
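For illustration, a custom rule for flagging a possible SYN flood against our target host might look like the following. This is a sketch only; the message text, the count/seconds threshold, and the `sid` are arbitrary values that would need tuning for a real deployment.

```text
alert tcp any any -> 192.168.56.101 80 (msg:"Possible TCP SYN flood"; flags:S; detection_filter:track by_dst, count 70, seconds 10; sid:1000001; rev:1;)
```

The `flags:S` option matches SYN-only packets, and `detection_filter` suppresses the alert until more than 70 such packets hit the destination within 10 seconds, so ordinary connection attempts do not trigger it.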



Problem Statement
Denial-of-Service (DoS) attacks pose a serious threat to Healthcare IoT (H-IoT) devices,
which often have limited computational resources and depend on cloud-based systems for
data processing. These attacks can overload devices and servers, leading to service
interruptions and increasing the risk of data interception during transmission. Conventional
defense strategies are often inadequate against gradual and evolving threats such as
Distributed Denial of Service (DDoS) attacks, which can escalate rapidly and disrupt
operations.

Proposed Solution: Edge-to-Cloud System for Real-time DoS Detection
The proposed system leverages an edge-to-cloud architecture to efficiently and accurately
detect Denial-of-Service (DoS) attacks in real time. It integrates a lightweight edge model
designed for rapid detection with a more robust cloud-based verification model, ensuring a
balance between swift responsiveness and high accuracy.

● Edge Model: CNN-LSTM Hybrid for Fast, Local Detection

Primary Objective:
The edge device (e.g., gateway or router) runs a lightweight CNN-LSTM model to
monitor network traffic and detect DoS attacks in real time. It’s optimized for fast
detection with minimal computational load, enabling timely responses.

Key Features:

1. CNN for Feature Extraction: The CNN processes key network traffic features (e.g.,
src_bytes, dst_bytes, serror_rate). Kernels focus on features indicative of DoS
behaviour.
2. LSTM for Time-Series Analysis: The LSTM analyzes traffic patterns over time,
identifying anomalies like sudden SYN errors or traffic surges that signal potential
attacks.
3. Overfitting for Sensitivity: The edge model is overfitted to known DoS patterns,
prioritizing fast detection over precision, which results in more false positives but
fewer missed attacks.

Detection and Response Workflow:

1. No Attack Detected: If no attack is found, the edge device either takes no action or
sends a "no attack" signal for confirmation.
2. Attack Detected: Detected attacks trigger encryption of traffic data, which is then
sent to the cloud for deeper analysis and verification.
● Cloud Model: Heavy Cross-Verification for Final Decision

Primary Objective:
The cloud model cross-verifies flagged traffic from the edge device using
computationally intensive algorithms to accurately confirm DoS attacks before alerting
the user.

Key Features:

1. Decryption of Data: The cloud model securely decrypts the data sent from the edge
device for further analysis.
2. Cross-Validation: It uses advanced machine learning and anomaly detection
algorithms, leveraging historical data to ensure high accuracy.
3. Attack Confirmation: The model filters out false positives and determines if an
actual attack is occurring.

Response Workflow:

1. Verified Attack: If confirmed, an accurate alert is sent to the user for defensive
action.
2. False Alarm: If no attack is found, the cloud logs it as a false alarm and prevents
unnecessary user notifications.

● Communication and Data Encryption


Data Encryption: All flagged traffic is encrypted using lightweight algorithms like AES
to secure transmission from the edge device to the cloud.

Optimized Communication: The edge device only transmits data when suspicious
activity is detected, reducing communication overhead and conserving bandwidth.

● User Alerts and System Monitoring

Attack Alerts: Users receive alerts only when the cloud model confirms a DoS attack,
preventing false positives from overwhelming them.

System Health Monitoring: The edge device may periodically send health reports to
reassure users that the system is functioning normally when no attacks are detected.

Why the edge model is overfitted towards DoS attacks

In this setup, the edge model is intentionally overfitted to known DoS attack patterns, leading
to:

● Reducing False Negatives: The model's high sensitivity lowers the chance of missed
attacks.
● Increasing False Positives: While this raises the likelihood of normal traffic being
flagged, the cloud model's secondary verification minimizes the impact of these false
positives.
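This sensitivity/precision trade-off can be sketched as a simple decision-threshold example. The scores and labels below are hypothetical, not outputs of the actual model: lowering the alert threshold at the edge eliminates misses at the cost of extra false alarms, which the cloud stage then filters.

```python
def confusion(scores, labels, threshold):
    """Count false negatives and false positives for a given alert threshold.
    labels: 1 = DoS attack, 0 = normal traffic."""
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fn, fp

# Hypothetical model scores for six connections.
scores = [0.95, 0.60, 0.40, 0.30, 0.55, 0.10]
labels = [1,    1,    1,    0,    0,    0]

print(confusion(scores, labels, 0.7))   # strict threshold: (2, 0) -> two attacks missed
print(confusion(scores, labels, 0.35))  # sensitive edge threshold: (0, 1) -> one false alarm
```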
SOLUTION PIPELINE:

1. Packet information is collected from third-party tools (Wireshark).
2. The CNN+LSTM model deployed on the edge (overfitted towards attacks) analyses the packets.
3. If no DoS attack is detected, the user is alerted: “NO ATTACK”.
4. If a DoS attack is detected, packet information is recollected along with its time series, then encoded and sent to the cloud.
5. The packet information is analysed by a computationally intensive model in the cloud.
6. If the cloud model detects no DoS attack, the user is alerted: “NO ATTACK”; if it confirms the attack, the user is alerted: “ATTACK”.
Comparative Analysis Report: Proposed Edge-to-Cloud DoS
Detection System vs. Industry Standard

Introduction

This report provides a comparison between the proposed Edge-to-Cloud DoS detection model
and current industry-standard methods. The proposed system employs a lightweight CNN-
LSTM model on edge devices for fast, local detection of DoS attacks and an ANN-based
verification model in the cloud. This hybrid approach aims to achieve a balance between
responsiveness, resource efficiency, and detection accuracy.

Current Industry Standards

1. Signature-Based Detection: This is the primary method used in Intrusion Detection
Systems (IDS) and firewalls. Signature-based detection relies on known patterns or
"signatures" of attacks, such as specific payloads or traffic patterns that match
previous DoS attack profiles. Systems like Snort or Suricata analyze packet headers
and payloads in real-time and block any traffic that matches these signatures.

2. Anomaly-Based Detection: Anomaly detection is popular in the industry for
identifying potential DoS attacks by monitoring network traffic and identifying
deviations from typical patterns. Tools like Cisco's Intrusion Prevention Systems (IPS)
use threshold-based alerts (e.g., a sudden spike in traffic volume or connection
attempts) to flag potentially malicious traffic. This method is favored because it can
detect unknown DoS patterns by identifying abnormal behavior rather than relying on
predefined signatures.

3. Rate-Limiting and Traffic Shaping: Rate-limiting is another industry-standard
technique to prevent DoS attacks by controlling the volume of traffic allowed from
specific sources. Web servers and firewalls often implement rate-limiting to control
the flow of requests, protecting against volumetric DoS attacks. For example,
Cloudflare's DDoS protection services use rate-limiting to handle excessive request
bursts.
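Rate-limiting as described above is commonly implemented as a token bucket. The sketch below is a minimal illustration of the idea, not any particular vendor's implementation; the rate and capacity values are arbitrary.

```python
class TokenBucket:
    """Allow at most `rate` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Refill tokens according to elapsed time, then spend one if available.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, capacity=2)   # 2 requests/second, burst of 2
results = [bucket.allow(t) for t in (0.0, 0.1, 0.2, 1.2)]
print(results)  # the burst is absorbed, the third request is dropped, a later one passes
```

This also shows the technique's weakness noted below: the bucket drops requests purely by volume, so a low-rate attack that stays under the refill rate is never blocked.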

4. Web Application Firewalls (WAFs): WAFs like those provided by Akamai,
Cloudflare, and AWS Shield can identify and block common DoS attack patterns,
such as high volumes of HTTP requests. These tools inspect incoming traffic for
indicators of DoS attacks, often combining signature- and anomaly-based techniques
with rule-based policies to mitigate attacks.
Sri Nithya Bandi Bhavya Reddy Sama
Pranjal Bhardwaj Sreyash Somesh Mishra
5. Behavioral Analytics: Some newer systems use behavioral analytics, where DoS
detection is based on analyzing user behavior over time to establish patterns. This
approach can spot unusual behavior indicative of a DoS attack, even if it doesn’t
match known signatures. Behavioral analytics tools are often incorporated in SIEM
(Security Information and Event Management) systems, like Splunk, to give a broader
context of network health.

6. Pulse Traffic Analysis is a technique used to detect patterns in network traffic that
involve periodic bursts, often linked to malicious activities like DoS attacks (e.g.,
DNSbomb). It involves analyzing the timing, volume, and frequency of traffic to
identify irregular pulses that deviate from normal baselines. Tools like statistical
models, temporal metrics, and machine learning algorithms enhance detection
accuracy. This approach is crucial in distinguishing between legitimate high-traffic
events and attack-related traffic anomalies, enabling proactive mitigation.

7. Blockchain-Based Defense: This detection mechanism for DoS and DDoS attacks
leverages the decentralized nature of the blockchain, ensuring real-time monitoring of
traffic across multiple nodes. Each node records traffic data in an immutable ledger,
which is compared against normal traffic patterns to identify anomalies such as
sudden spikes or repetitive requests, which are common signs of DDoS attacks. Machine
learning models further enhance detection by analyzing traffic patterns and
recognizing deviations. When an attack is detected, smart contracts can trigger
automated defense measures, like rate-limiting or traffic redirection, reducing the
impact. This decentralized, collaborative approach, coupled with tamper-proof logs,
improves detection accuracy and makes the system more resilient to attacks.

Comparison with Proposed Solution

1. Signature-Based Detection
● Comparison: Signature-based systems rely on pre-defined attack patterns,
offering efficient real-time detection of known DoS attacks. However, they
lack adaptability to novel or evolving attack types.
● Advantages of Proposed Solution: The CNN-LSTM model at the edge adapts
to new attacks by learning high-level and sequential features from the
NSL-KDD dataset. The LSTM component enables detection of temporal patterns
beyond static signatures. The cloud-based Random Forest (RF) further
validates edge predictions by leveraging structured tabular data analysis.
● Drawbacks of Signature-Based Detection: Signature-based methods are
limited to known attacks and can be evaded by variations. The proposed
edge-cloud solution dynamically detects both known and unknown patterns,
offering adaptability and resilience.
2. Anomaly-Based Detection

● Comparison: Like traditional anomaly-based systems, the CNN-LSTM edge
model identifies deviations from normal traffic. However, it is optimized to
detect DoS patterns specifically, leveraging overfitting to minimize false
negatives while tolerating some false positives. The RF cloud model validates
flagged events, reducing false alarms.
● Advantages of Proposed Solution: Anomaly detection struggles with high data
volumes and false positives. The proposed edge-cloud solution combines
lightweight edge detection with robust Random Forest validation in the cloud,
ensuring precise detection.
● Drawbacks of Anomaly Detection Alone: Pure anomaly-based systems often
overload security teams with alerts in dynamic environments. The cloud-based
RF validation mitigates this issue by confirming anomalies flagged at the edge.

3. Rate-Limiting and Traffic Shaping

● Comparison: Rate-limiting restricts request volumes from IPs, effectively
managing basic volumetric DoS attacks but failing against sophisticated or
low-rate attacks.
● Advantages of Proposed Solution: The CNN-LSTM model goes beyond static
limits by detecting patterns indicative of diverse DoS attack types, including
low-rate and stealthy attacks. The RF model on the cloud further analyzes
flagged patterns for precise attack identification.
● Drawbacks of Rate-Limiting Alone: Rate-limiting may block legitimate traffic
and lacks adaptability. The edge-cloud solution dynamically analyzes
feature-based behaviors, combining the strengths of CNN and RF for a more
refined detection mechanism.

4. Web Application Firewalls (WAFs)

● Comparison: WAFs are versatile, effective for application-layer attacks, and
use both signature- and rule-based methods. However, they may struggle with
novel or complex DoS patterns.
● Advantages of Proposed Solution: Unlike WAFs that require constant updates
and configurations, the CNN-LSTM edge model dynamically learns from
traffic data, adapting to new attack patterns. The RF model in the cloud
provides additional precision by validating edge-detected anomalies based on
detailed tabular analysis.
● Drawbacks of WAFs: WAFs may miss sophisticated DoS attacks outside
predefined rules and add latency. The edge model is lightweight and optimized
for low-power processing, while cloud validation efficiently handles
resource-intensive tasks.
5. Behavioral Analytics

● Comparison: Behavioral analytics identifies deviations by tracking user patterns
over time, requiring extensive historical data and often leading to false positives.
● Advantages of Proposed Solution: The LSTM component mimics behavioral
analysis by examining sequential traffic patterns, while the RF model in the
cloud refines detections using structured data analysis. This combination
balances sensitivity with accuracy.
● Drawbacks of Behavioral Analytics Alone: Dependency on large datasets and
high resource demands make real-time detection difficult. The edge-based
CNN-LSTM model enables real-time detection, with RF validation providing
reliable accuracy.

6. Pulse Traffic Analysis

● Comparison: Pulse traffic analysis detects bursts of traffic by analyzing
temporal patterns and irregularities in traffic flow, focusing primarily on
short-term anomalies. It is resource-intensive and prone to false positives.
● Advantages of Proposed Solution: The CNN-LSTM model, on the other hand,
integrates anomaly detection (CNN) for feature extraction and temporal behavior
analysis (LSTM) for long-term pattern recognition. This model offers better
accuracy, reduces false positives, and is more adaptable to evolving attack
methods over time, making it more robust and scalable.
● Drawbacks of Pulse Traffic Analysis Alone: Pulse traffic analysis can produce
false positives, flagging legitimate spikes as attacks. It also demands significant
computational resources, especially in high-traffic environments, which may
impact system performance. The setup is often complex, requiring careful tuning
to balance detection accuracy against unnecessary alerts.

7. Blockchain-Based Defense

● Comparison: Blockchain-based defense is more suited for decentralized resilience
and post-attack analysis with automated smart-contract responses, while
CNN-LSTM excels at real-time anomaly detection and behavioral analysis in
network traffic.
● Advantages of Proposed Solution: The CNN-LSTM deep learning-based DoS
detection offers a faster, more flexible, and less resource-intensive solution
compared to blockchain-based defenses, particularly for real-time anomaly
detection and adaptive mitigation of DoS attacks.
● Drawbacks of Blockchain-Based Defense: While blockchain offers strong security
through decentralization and immutability, its high computational costs,
scalability issues, latency, and integration complexity limit its practicality,
especially in real-time or large-scale deployments.
Dataset Information

Introduction

The NSL-KDD dataset has emerged as a standard benchmark for network intrusion detection
research. It addresses key limitations of its predecessor, the KDD’99 dataset, which faced
criticism for issues such as high redundancy and skewed record frequency. These shortcomings
led to biased performance evaluation of machine learning models. By mitigating these concerns,
the NSL-KDD dataset provides a more realistic and effective framework for assessing and
benchmarking machine learning models and cybersecurity techniques. This makes it an
invaluable resource for developing robust network intrusion detection systems.

Dataset Structure

The dataset consists of labeled network connection records, classified as either "normal" or
specific network attacks. Each connection is described using multiple features that encapsulate
its attributes and behaviors. To facilitate efficient evaluation, the dataset is organized into two
subsets:

· KDDTrain+: This subset contains labeled network connections for model training and
development.
· KDDTest+: This testing set includes labeled connections, with some attack types not
present in the training set. This setup evaluates how well models generalize to unseen
attack types.

Features of the NSL-KDD Dataset

The dataset describes each network connection with 41 features, grouped into three categories:

Basic Features: These describe general properties of each connection, such as:

o Duration: Connection duration in seconds.
o Protocol Type: Protocol used (e.g., TCP, UDP, ICMP).
o Service: Destination network service (e.g., HTTP, FTP).
o Flag: Connection status, signaling conditions like successful or failed attempts.

PRANJAL BHARDWAJ
SREYASH SOMESH MISHRA
Content Features: These analyze the payload of connections to detect specific attack
activities:

o Failed Logins: Number of unsuccessful login attempts.
o File Creation Operations: Commands related to file creation.
o Shell Prompts: Count of shell commands accessed.

Traffic Features: These are derived from statistical observations over time, highlighting
patterns like:

o Count: Number of connections to the same host within a timeframe.
o Serror_rate: Percentage of connections with SYN errors.
o Rerror_rate: Percentage of connections with REJ errors.

Types of Attacks in NSL-KDD

The dataset categorizes network intrusions into four main attack types:

· DoS (Denial-of-Service): Overwhelms network resources with requests (e.g., Smurf, Neptune).
· Probe: Gathers information for potential future exploitation (e.g., Nmap, Ipsweep).
· R2L (Remote-to-Local): Gains unauthorized access to a remote machine (e.g., Guess_Password, Warezclient).
· U2R (User-to-Root): Exploits vulnerabilities to gain superuser privileges (e.g., Buffer Overflow, Rootkit).

For this study, the focus is on detecting DoS attacks, with other attack types treated as normal
conditions.

Target Label and Class Imbalance

The target label in each connection record identifies the connection as either "normal" or a
specific attack type. One significant challenge posed by the NSL-KDD dataset is its inherent
class imbalance, a common trait in network intrusion datasets. This imbalance means that
certain types of attacks, like DoS and Probe, are much more frequent than others. This was
used to our advantage: the edge model trained on this dataset is inherently better at
detecting DoS attacks, reducing the number of false negatives.

Data Preprocessing for NSL-KDD Dataset

Objective of Preprocessing

The primary goal of preprocessing the NSL-KDD dataset was to prepare it for the effective
training of machine learning models, specifically for detecting Denial-of-Service (DoS) attacks.
The process addressed critical steps such as handling categorical data, normalizing numerical
features, mapping attack types, and selecting features to enhance model accuracy and
computational efficiency.

Data Loading and Column Assignment

The raw dataset was loaded, and appropriate column names were assigned to ensure clarity and
correctness in feature identification. These columns included key attributes such as connection
duration, protocol type, service type, connection status (flag), and various traffic statistics, all of
which describe network behaviors.

Handling Categorical Features

Categorical features like protocol_type, service, and flag were transformed into numerical
representations using label encoding. This step was essential for making the data compatible with
machine learning models, which typically require numerical inputs. Each category was mapped
to a unique integer, preserving the underlying information while enabling effective computation.
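The label-encoding step can be sketched with a plain dictionary mapping. The category values below are illustrative; a library encoder such as scikit-learn's `LabelEncoder` implements the same idea (though it assigns integers in sorted rather than first-seen order).

```python
def label_encode(values):
    """Map each distinct category to a unique integer, in first-seen order."""
    mapping = {}
    encoded = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        encoded.append(mapping[v])
    return encoded, mapping

protocols = ["tcp", "udp", "tcp", "icmp", "udp"]
encoded, mapping = label_encode(protocols)
print(encoded)   # [0, 1, 0, 2, 1]
print(mapping)   # {'tcp': 0, 'udp': 1, 'icmp': 2}
```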

Target Label Mapping: DoS Attack Classification

The dataset's attack column, originally comprising multiple network intrusion types, was mapped
to a binary classification:

· Normal Traffic: Labeled as 0.
· DoS Attacks: Labeled as 1.

Specific attack types categorized as DoS attacks were:

· back: Floods a host with packets, preventing legitimate use.

· land: Spoofs source and destination IP addresses to cause denial of service.
· neptune: A SYN flood attack designed to overwhelm network resources.
· pod (Ping of Death): Sends oversized packets to crash the target system.
· smurf: Exploits ICMP echo requests to flood a target with traffic.
· teardrop: Exploits IP fragmentation vulnerabilities to crash systems.

Other types of attacks, such as Remote-to-Local (R2L), User-to-Root (U2R), and Probe attacks,
were considered part of normal traffic for this specific training purpose.
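The binary mapping described above can be sketched directly from the list of DoS attack names given in this section; every other label, including R2L, U2R, and Probe attacks, maps to 0:

```python
# DoS attack classes in NSL-KDD, as listed above.
DOS_ATTACKS = {"back", "land", "neptune", "pod", "smurf", "teardrop"}

def to_binary_label(attack_name):
    """1 for a DoS attack record, 0 for normal traffic and all other attack types."""
    return 1 if attack_name in DOS_ATTACKS else 0

print([to_binary_label(a) for a in ("neptune", "normal", "nmap", "smurf")])  # [1, 0, 0, 1]
```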

Feature Normalization

Numerical features were normalized using the MinMaxScaler technique to scale values between
0 and 1. This process ensured that features with larger ranges, such as src_bytes and dst_bytes,
did not disproportionately influence model training. Excluding categorical and target columns,
this normalization step enhanced the numerical stability of the model and provided a
standardized feature space for machine learning.
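Min-max scaling maps each feature to the range [0, 1]; the following stdlib-only sketch performs the same per-column computation as `MinMaxScaler` (the sample values are illustrative):

```python
def min_max_scale(column):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant column: map everything to 0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

src_bytes = [0, 500, 1000, 250]      # example raw values for one feature
print(min_max_scale(src_bytes))      # [0.0, 0.5, 1.0, 0.25]
```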

Feature Analysis and Selection
Methods used:

· Mean analysis

· Plotting features

Objective of Mean Analysis

A mean analysis was conducted to identify features with the most significant differences between
normal traffic and DoS attack traffic. This step aimed to prioritize features for the edge model,
focusing computational resources on high-impact attributes.
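The analysis can be sketched as follows: for each feature, subtract its normalized mean under normal traffic from its mean under attack traffic and rank by absolute difference. The two rows of toy values below are made up for illustration, not taken from the dataset:

```python
def rank_by_mean_difference(normal_rows, attack_rows, features):
    """Rank features by |mean(attack) - mean(normal)|, descending."""
    def mean(rows, f):
        return sum(r[f] for r in rows) / len(rows)
    diffs = {f: abs(mean(attack_rows, f) - mean(normal_rows, f)) for f in features}
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)

normal = [{"serror_rate": 0.02, "duration": 0.30}, {"serror_rate": 0.02, "duration": 0.28}]
attack = [{"serror_rate": 0.75, "duration": 0.29}, {"serror_rate": 0.75, "duration": 0.31}]
ranking = rank_by_mean_difference(normal, attack, ["serror_rate", "duration"])
print(ranking)  # serror_rate dominates; duration barely changes between conditions
```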

Findings from Mean Analysis:

Features with the highest differences, in approximate order. These values were obtained by subtracting the normalized mean value of each feature under normal conditions from its mean value under attack conditions.
1. srv_serror_rate: 0.731816 selected
2. serror_rate: 0.731626 selected
3. dst_host_serror_rate: 0.731474 selected
4. same_srv_rate: 0.740047 selected
5. dst_host_same_srv_rate: 0.62412 selected
6. dst_host_srv_count: 0.549149411 selected
7. count: 0.28987182 selected
8. dst_host_count: 0.381967844 selected
9. last_flag: 0.012914 selected
10. wrong_fragment: 0.021433333 selected
11. srv_diff_host_rate: 0.14406
12. rerror_rate: 0.050139 selected
13. srv_rerror_rate: 0.05012 selected
14. dst_host_rerror_rate: 0.060276
15. num_failed_logins: 0.00026

16. num_root: 8.43E-05
17. num_compromised: 7.12E-05
18. num_access_files: 0.001044444
19. su_attempted: 0.00125
20. hot: 0.003472728
21. root_shell: 0.0031
22. num_shells: 0.0003
23. num_file_creations: 0.000302326
24. urgent: 0.0001
25. land: 0.0001
26. is_guest_login: 0.0139
27. num_outbound_cmds: 0
28. is_host_login: 0
29. dst_bytes: 3.25E-06
30. src_bytes: 1.26E-05
31. duration: 0.009985751

Those features with “selected” written next to them were selected for training the edge model.

Explanation of different values in normal and attack conditions.

1. Server SYN Error Rate (srv_serror_rate):


· Normal: 0.017885
· DoS Attack: 0.749701
Explanation: This feature reflects the error rate for SYN packets on the server-side.
In normal operations, the server handles incoming connections successfully, resulting
in low error rates. During a DoS attack (such as SYN flood), the server is
overwhelmed with incomplete requests, increasing the error rate significantly.

2. SYN Error Rate (serror_rate):


· Normal: 0.019468
· DoS Attack: 0.751094
Explanation: serror_rate is related to SYN packet failures in the network. Under

normal conditions, SYN packets are typically acknowledged, resulting in a low
serror_rate. During DoS attacks, many SYN packets are sent without completing
the handshake, leading to connection failures and a high serror_rate.

3. Destination Host SYN Error Rate (dst_host_serror_rate):


· Normal: 0.019497
· DoS Attack: 0.750971
Explanation: This feature tracks SYN packet errors on the destination host.
dst_host_serror_rate is typically low under normal conditions because successful
connections occur. However, during DoS attacks, especially SYN floods, a large
number of incomplete SYN requests overwhelm the destination, resulting in higher
dst_host_serror_rate values.

4. Percentage of Connections to the Same Service (same_srv_rate):


· Normal: 0.932395
· DoS Attack: 0.192348
Explanation: same_srv_rate measures the percentage of connections directed to the
same service. In normal traffic, this value is high as users tend to access a few
consistent services. During DoS attacks, attackers often distribute connections across
various services, causing the same_srv_rate to drop.

5. Destination Host Same Service Rate (dst_host_same_srv_rate):


· Normal: 0.746275
· DoS Attack: 0.122155
Explanation: dst_host_same_srv_rate tracks the percentage of connections to the
same service on the destination host. In normal traffic, a high percentage of requests
are directed to the same service. During a DoS attack, attackers distribute requests
across various services, causing this value to decrease significantly.

6. Destination Host Service Count (dst_host_srv_count):


· Normal: 0.651643529
· DoS Attack: 0.102494118
Explanation: This feature represents the number of services on the destination host
that are being accessed. Under normal conditions, there is moderate usage of services.
In a DoS attack, especially when multiple services are targeted, the
dst_host_srv_count drops as attackers attempt to overwhelm the system, accessing
services in an uncoordinated manner.

7. Number of Connections to the Same Host (count):
· Normal: 0.059363014
· DoS Attack: 0.349234834
Explanation: The count feature represents the number of connections made to the
same host. Normal traffic has fewer connections, but during a DoS attack, especially
with a focus on one host, the count of connections drastically increases as attackers
target the same host repeatedly.

8. Destination Host Count (dst_host_count):


· Normal: 0.578181176
· DoS Attack: 0.96014902
Explanation: dst_host_count measures the number of connections directed at a
particular destination host. During normal traffic, the destination host typically
handles a moderate number of requests. However, in a DoS attack, there is a surge in
connections to the destination, which explains the significant increase in
dst_host_count.

9. Last Flag (last_flag):


· Normal: 0.932933333
· DoS Attack: 0.920019048
Explanation: last_flag indicates the final flag in a connection attempt, typically
reflecting whether the connection attempt was successful or terminated. While this
feature is more subtle in its difference between normal and attack conditions, during
DoS attacks, some flags may not be sent properly due to packet loss or incomplete
connection attempts, leading to small differences.

10. Wrong Fragment (wrong_fragment):


· Normal: 0
· DoS Attack: 0.021433333
Explanation: wrong_fragment tracks incorrectly fragmented packets. Under normal
conditions, there are rarely any wrong fragments, as data is properly structured.
During a DoS attack, attackers often fragment packets to bypass filters or cause
disruption, leading to an increase in wrong_fragment.

11. REJ Error Rate (rerror_rate):
· Normal: 0.098998
· DoS Attack: 0.149137
Explanation: rerror_rate measures the rate of rejected connections due to errors. In
normal conditions, some connection requests may be rejected. However, during a
DoS attack, more requests are invalid or malicious, increasing the rerror_rate as the
server rejects those attempts.

12. Server REJ Error Rate (srv_rerror_rate):


· Normal: 0.099931
· DoS Attack: 0.150051
Explanation: Similar to rerror_rate, the srv_rerror_rate reflects the server’s
rejection rate of incoming requests. A higher srv_rerror_rate during a DoS attack
indicates the server’s struggle to handle the influx of invalid requests.

Objective of plotting features.
Visualize feature distributions and mean differences between normal and DoS traffic, enabling
effective feature selection and preprocessing validation. This ensures data quality and guides the
edge model to focus on high-impact attributes, improving accuracy and reducing computational
overhead.
Sample outputs from plotting.

Y axis: Density of samples

X axis: Normalized values of feature

Color coding:

· Red: DOS attack

· Blue: Normal condition

Edge model

CNN+LSTM model was chosen for its ability to efficiently extract critical features through
CNNs, while LSTMs effectively capture temporal patterns in network traffic, enabling early
detection of DoS attacks with minimal computational overhead on resource-constrained edge
devices.

Summary of chosen edge architecture:

1. Efficient Feature Extraction with CNN : The use of Convolutional Neural Networks
(CNNs) in the edge model allows for efficient processing of network traffic data by
leveraging convolutional kernels. These kernels extract critical features such as
abnormal byte rates, error patterns, and traffic surges, which are indicative of potential
Denial-of-Service (DoS) attacks. The feature compression provided by CNNs minimizes
data complexity while retaining essential information, ensuring computational overhead
is reduced. This lightweight design is particularly suited for resource-constrained edge
devices, enabling real-time anomaly detection without overwhelming hardware
capabilities.

2. Temporal Analysis with LSTM: Long Short-Term Memory (LSTM) networks
complement CNNs by analyzing the temporal relationships in network traffic. Unlike
traditional models, LSTMs are adept at capturing sequential dependencies, allowing
them to detect evolving patterns indicative of DoS attacks. This temporal analysis
enables the model to identify threats early, often before an attack reaches its peak. The
proactive detection capability of LSTMs not only enhances the system's accuracy but
also distinguishes transient traffic spikes from sustained attack behaviour, improving
overall reliability.

Steps followed to create the model:

The model leverages a CNN-LSTM hybrid architecture optimized for detecting
Denial-of-Service (DoS) attacks using the NSL-KDD dataset. Since the dataset lacks inherent
temporal information, we had to preprocess it into sequences to simulate time-series data,
allowing the LSTM component to learn temporal patterns.

1. Data Preprocessing and Feature Selection

The training pipeline begins with preprocessing the NSL-KDD dataset to standardize its format
and enhance computational efficiency:

· Feature Selection: Using mean analysis, we identified features with the highest
variance between normal and attack traffic, such as srv_serror_rate, serror_rate, and
dst_host_serror_rate. These features provide critical insights into DoS behavior.
· Scaling Factors: Selected features were scaled by a factor of 1.5 to emphasize their
importance during training while still considering all available features. This
approach ensures that critical features have a stronger influence on the model without
discarding other potentially useful attributes.
· Sliding Window Technique: To simulate sequential behavior, the data was formatted
into overlapping sequences of five rows. Each sequence represents a
pseudo-time-series input, enabling the LSTM component to analyze temporal patterns
across connection events.
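The sliding-window formatting described above can be sketched as follows; this is a minimal NumPy illustration with made-up toy values, not the project's exact preprocessing code:

```python
import numpy as np

def make_sequences(features: np.ndarray, labels: np.ndarray, window: int = 5):
    """Turn tabular rows into overlapping pseudo-time-series windows.

    Each window of `window` consecutive rows becomes one LSTM input;
    the label of the last row in the window labels the whole sequence.
    """
    xs, ys = [], []
    for start in range(len(features) - window + 1):
        xs.append(features[start:start + window])
        ys.append(labels[start + window - 1])
    return np.stack(xs), np.array(ys)

# Toy data: 8 connection records with 3 features each
X = np.arange(24, dtype=float).reshape(8, 3)
y = np.array([0, 0, 0, 1, 1, 0, 0, 1])
seq_X, seq_y = make_sequences(X, y, window=5)
print(seq_X.shape)  # (4, 5, 3): four overlapping sequences of five rows
```

Because consecutive windows share four of their five rows, the model sees both static feature values and how they evolve across adjacent connection events.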

2. CNN-LSTM Hybrid Model

The architecture combines the strengths of Convolutional Neural Networks (CNNs) and Long
Short-Term Memory (LSTM) networks:

· CNN Component:
o Extracts spatial features by applying convolutional filters to the input sequences.
The CNN reduces dimensionality and computational overhead while
highlighting patterns within feature groups.
o Pooling layers further compress the feature space, ensuring efficient processing
on resource-constrained systems.
· LSTM Component:
o Processes the reduced features from the CNN over the simulated time-series
sequences, capturing temporal dependencies critical for detecting evolving
attack patterns.
o By analyzing changes in features like count and serror_rate across time steps,
the LSTM identifies sequential anomalies indicative of DoS activity.
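As an illustrative sketch only (the layer sizes, kernel width, and 41-feature input are assumptions, not the project's exact configuration), the CNN-LSTM combination described above can be expressed in PyTorch:

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """CNN compresses each window; the LSTM models the remaining time steps."""
    def __init__(self, n_features: int, n_classes: int = 2):
        super().__init__()
        # Conv1d expects (batch, channels, seq_len); features act as channels.
        self.conv = nn.Conv1d(n_features, 16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(kernel_size=2)   # halves the sequence length
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):                 # x: (batch, seq_len, n_features)
        x = x.transpose(1, 2)             # -> (batch, n_features, seq_len)
        x = self.pool(torch.relu(self.conv(x)))
        x = x.transpose(1, 2)             # -> (batch, seq_len', 16)
        _, (h, _) = self.lstm(x)          # h[-1]: last hidden state per sample
        return self.fc(h[-1])             # logits, shape (batch, n_classes)

# 41 raw NSL-KDD features are assumed here; use the preprocessed count in practice
model = CNNLSTM(n_features=41)
logits = model(torch.randn(8, 5, 41))    # batch of 8 five-row windows
print(logits.shape)                      # torch.Size([8, 2])
```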

3. Modification of Loss Function and Class Imbalance

To address the imbalance in the NSL-KDD dataset, where normal traffic significantly outweighs
attack traffic:

· We employed a weighted cross-entropy loss function, assigning higher penalties to false
negatives to prioritize the detection of DoS attacks.
· Weighted sampling during training ensured that attack samples are adequately
represented, enabling the model to generalize better to under-represented classes.
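Both ideas can be sketched in PyTorch; the 3:1 class-weight ratio and the toy label vector below are illustrative assumptions, not the values tuned in the project:

```python
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# Class weights: penalise missed attacks (class 1) more than missed normals.
# The 3:1 ratio is an illustrative assumption, not the project's tuned value.
class_weights = torch.tensor([1.0, 3.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Weighted sampling: draw each sample with probability inverse to its class
# frequency, so attack windows appear about as often as normal ones per epoch.
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])   # toy label vector
class_counts = torch.bincount(labels).float()      # tensor([6., 2.])
sample_weights = 1.0 / class_counts[labels]        # one weight per sample
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels),
                                replacement=True)

# Toy loss computation on random logits
logits = torch.randn(8, 2)
loss = criterion(logits, labels)
print(float(loss) > 0)   # the weighted loss is a positive scalar
```

The sampler would be passed to the DataLoader via its `sampler` argument in place of shuffling.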

4. Considering Computational Efficiency

Given the edge constraints of real-time DoS detection systems, computational efficiency was a
key consideration:

· Scaling Features: By focusing on a reduced set of high-impact features and scaling
them appropriately, the model avoids unnecessary computation while retaining
critical information.
· Dynamic Input Resizing: The CNN dynamically adjusts to the input dimensions,
ensuring compatibility with various feature sets and sequence lengths without
hardcoding architectural parameters.
· Low Parameter Count: The combination of pooling layers in CNN and the selective
feature set significantly reduces the parameter count, enabling the model to run
efficiently on edge devices.

5. Training Strategy
· Batch Processing: DataLoader efficiently handles training samples in batches, reducing
memory overhead while ensuring stable gradient updates.
· Sliding Window Sampling: By overlapping sequences during preprocessing, the model
learns both static and evolving patterns, improving its generalization to unseen data.
· Regularization: Dropout layers and weight decay are incorporated into the training
process to mitigate overfitting, ensuring the model performs well on test datasets.

Model Results:
The following results were obtained from training the edge model from scratch on the NSL-KDD dataset.

Results from training data:

The results presented were obtained from training the CNN+LSTM model on the NSL-KDD
dataset for 10 epochs. The training accuracy reached 99.73%, with accuracy increasing and
loss decreasing steadily across epochs, indicating that the model is learning to differentiate
between normal traffic and DoS attacks. Because training metrics alone cannot rule out
overfitting, generalization is assessed on unseen test data in the next section.

Results from unseen test data:

The results presented were obtained from testing the CNN+LSTM model on unseen test data
from the NSL-KDD dataset. The test accuracy achieved was 92.71%, with a relatively low
number of false negatives (approximately 180), indicating that the model effectively identifies
DoS attacks. The RAM usage for testing 22,540 samples was minimal, at just 4.199 MB,
demonstrating that the model is suitable for deployment on edge devices with limited
computational resources. Additionally, the model's inference time was 0.548 seconds, enabling
real-time responses for DoS attack detection, which is crucial for edge-based applications.
These results confirm that the model not only performs well but is also efficient enough to
operate in real-time environments.

Suggested methods to improve accuracy:

1. Adjusting the Classification Threshold

To reduce false negatives (DoS attacks classified as normal), adjusting the classification
threshold can help. By lowering the threshold, the model becomes more sensitive to potential
DoS attacks, detecting them earlier—even with lower confidence scores. Although this might
increase false positives, it ensures that more attacks are flagged proactively.
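Threshold adjustment can be sketched as follows, using made-up probability outputs for illustration:

```python
import numpy as np

def classify(attack_probs: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Flag a window as a DoS attack when P(attack) meets the threshold."""
    return (attack_probs >= threshold).astype(int)

# Hypothetical model outputs: P(attack) for six traffic windows
probs = np.array([0.10, 0.35, 0.48, 0.55, 0.80, 0.95])

default = classify(probs, 0.5)     # [0, 0, 0, 1, 1, 1]
sensitive = classify(probs, 0.3)   # [0, 1, 1, 1, 1, 1]: fewer missed attacks,
                                   # but borderline windows now raise alerts too
print(default.sum(), sensitive.sum())
```

The threshold would typically be chosen by sweeping values on a validation set and picking the point with an acceptable false-positive rate.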

2. Enhanced Feature Engineering

Exploring additional time-series features or higher-order statistics, such as rolling window
statistics or autocorrelation, could capture more complex patterns in network traffic. These new
features would provide more information for detecting early-stage DoS attacks, enhancing model
performance.

3. Hyperparameter Tuning

Optimizing hyperparameters such as the number of LSTM layers, CNN filter sizes, and
learning rates using techniques like Grid Search can significantly improve model accuracy.
Cross-validation should be used to ensure the model generalizes well to unseen data and avoids
overfitting.

4. Real-Time Adaptive Learning

Implementing online or incremental learning allows the model to adapt to evolving attack
patterns without retraining from scratch. This can improve the system’s ability to detect new and
emerging DoS attacks in real-time, keeping the model up-to-date.

5. Utilizing Superior Datasets


A better model could be trained on datasets that inherently include time-series information, such
as the Network Intrusion dataset (CIC-IDS-2017). This dataset provides features like
inter-arrival times and flow durations, which align well with LSTM models, enabling the
detection of sequential attack patterns more effectively.

Challenges in Implementing Suggested Improvements

While several methods, including threshold adjustment and real-time adaptive learning, were
considered and tried for enhancing model performance, these solutions were not incorporated
due to constraints of the current implementation environment and the available device hardware.

Additionally, the NSL-KDD dataset does not contain information in time-series format, which limited
our ability to fully utilize the LSTM component of the model for sequential pattern recognition.
However, LSTM was incorporated by using a sliding window technique to simulate time-series
data, capturing sequential dependencies in network traffic despite the dataset's tabular structure.

We did not train the model on the CIC-IDS2017 dataset as it is too large to be managed efficiently
on our hardware. However, the modular design of the model and the training methodology ensure
that the same steps can be easily adapted to train a new model on this or other more advanced
datasets.

Challenges Faced in Implementing the Model in Real-Time

During the simulation of a DoS attack, data was collected using Wireshark to analyze network
activity. However, the predictive model failed to produce accurate results as the input data
extracted from Wireshark included only a subset of features, such as IP addresses and the
number of bytes transmitted between nodes. The model requires a comprehensive set of 43
features to make reliable predictions, which were not captured in the current data collection
process. Additionally, the model was trained on connection-level information, whereas the data
obtained from Wireshark was at the packet level. This necessitated aggregation of packet-level
data into connection-level metrics, introducing further complexity and potential inaccuracies. To
ensure effective testing and prediction, advanced data collection tools capable of capturing all
necessary features at the appropriate level of granularity are essential.
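The packet-to-connection aggregation described above can be sketched in pure Python; the record fields below are assumed for illustration and do not reflect the exact Wireshark export schema:

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packet records into connection-level (flow) summaries.

    A flow is keyed by (src, dst, dst_port, proto); per flow we count
    packets and sum bytes, two of the many connection-level features
    (e.g. count, src_bytes) that NSL-KDD-style models expect.
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p["src"], p["dst"], p["dst_port"], p["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["length"]
    return dict(flows)

# Toy capture: three packets, two of them belonging to the same flow
capture = [
    {"src": "10.0.0.2", "dst": "10.0.0.9", "dst_port": 80, "proto": "tcp", "length": 60},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "dst_port": 80, "proto": "tcp", "length": 1500},
    {"src": "10.0.0.3", "dst": "10.0.0.9", "dst_port": 80, "proto": "tcp", "length": 60},
]
flows = aggregate_flows(capture)
print(len(flows))  # 2 distinct connections
```

Rate-based NSL-KDD features such as serror_rate would additionally require tracking TCP flags per flow, which is where dedicated flow exporters become preferable to manual aggregation.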

Performance Comparison for Edge Deployment


This comparison evaluates the suitability of various models specifically for edge deployment.
While their core capabilities, such as feature extraction and pattern recognition, are discussed in
other sections, this section focuses on how well these models perform under the constraints of
edge environments, including computational efficiency, memory usage, and real-time processing
capabilities. This perspective highlights the CNN-LSTM model's advantages in balancing
accuracy with resource efficiency for edge-based DoS attack detection.

Model: CNN-LSTM (Proposed)
Architecture: Convolutional Neural Network + Long Short-Term Memory
Strengths for edge deployment:
· Efficient feature extraction via CNN with reduced computational overhead.
· LSTM captures temporal patterns, aiding in proactive attack detection.
Limitations for edge deployment:
· Preprocessing to simulate time-series data adds complexity during training.
· Relies on sufficient memory for LSTM operations.

Model: Random Forest (RF)
Architecture: Ensemble of Decision Trees
Strengths for edge deployment:
· High interpretability for anomaly detection in tabular data.
Limitations for edge deployment:
· Memory-intensive and unsuitable for real-time edge applications.
· Lacks support for sequential or temporal patterns.

Model: Support Vector Machine (SVM)
Architecture: Kernel-based Classifier
Strengths for edge deployment:
· Compact model size for small datasets.
Limitations for edge deployment:
· Poor scalability for large datasets.
· Limited adaptability to edge environments with high-dimensional data.

Model: Multi-Layer Perceptron (MLP)
Architecture: Feedforward Neural Network
Strengths for edge deployment:
· Simple architecture with low training complexity.
Limitations for edge deployment:
· Inefficient for temporal or spatial data.
· Requires extensive feature engineering to be effective on the edge.

Model: Gradient Boosting Machine (GBM)
Architecture: Gradient-based Ensemble Method
Strengths for edge deployment:
· Efficient for structured/tabular datasets.
· Can be pruned for smaller edge deployments.
Limitations for edge deployment:
· High computational overhead during inference.
· Limited sequential pattern detection.

Results from comparison

1. CNN-LSTM Superiority for Edge:


The CNN-LSTM model excels in edge environments by combining the strengths of CNNs and
LSTMs. The CNN efficiently extracts key spatial features, reducing computational overhead,
while the LSTM captures temporal dependencies, enabling early detection of DoS attack
patterns. This hybrid approach ensures real-time response capability while maintaining a
lightweight architecture suitable for resource-constrained edge devices.

2. Challenges with Alternative Models:

a) Traditional Machine Learning Models: While models like Random Forest and
Gradient Boosting Machines perform well with structured datasets, their high
memory and computational requirements make them impractical for edge
deployment.
b) Simpler Neural Networks: Models such as MLPs and SVMs are computationally
efficient but lack the ability to process sequential or temporal data, reducing their
effectiveness in detecting evolving attack patterns.
c) RNN-Based Models: Although RNNs capture sequential dependencies, they often
demand greater computational resources and are prone to gradient-related issues,
limiting their applicability in edge scenarios.

3. Real-Time Processing and Resource Usage:


The CNN-LSTM model achieves a practical balance between accuracy and resource usage. With
a RAM footprint of just 4.199 MB and a processing time of 0.548 seconds for 22,540 samples, it
supports real-time detection and quick response to potential DoS attacks, ensuring reliability in
edge-based operations.

4. Adaptability:
The modular design of the CNN-LSTM model allows it to be easily adapted to other datasets or
evolving threats. This flexibility ensures long-term viability for edge deployments as network
traffic patterns change.

Cloud-Based Random Forest Model for DoS Attack Detection
The cloud-based machine learning model leverages a Random Forest classifier to efficiently
detect Denial-of-Service (DoS) attacks, utilizing the NSL-KDD dataset as its foundation.
It is designed specifically for scalable cloud infrastructures. The model prioritizes accuracy,
robustness, and real-time threat analysis, making it a powerful tool for modern cybersecurity
challenges.
By incorporating strategies for large-scale data processing and resource optimization, the
system ensures seamless integration into distributed environments.

Model Architecture and Key Components


1. Random Forest Classifier
 Purpose: The Random Forest algorithm is an ensemble learning method that
aggregates multiple decision trees to improve accuracy and reduce overfitting. Its
inherent ability to handle tabular data and diverse features makes it ideal for
cybersecurity datasets like NSL-KDD.
 Key Characteristics:
o Tree Ensemble: Consists of 100 decision trees (n_estimators=100), where
each tree independently predicts outcomes. The model aggregates these
predictions using a majority-vote mechanism, boosting its stability and
generalization capacity.
o Feature Importance: The classifier evaluates the significance of each feature,
prioritizing indicators critical to DoS attack detection, such as:
 serror_rate (ratio of connection errors to total connections),
 srv_serror_rate (error rates specific to services),
 dst_host_srv_rerror_rate (error rate across service requests to specific
hosts).
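A minimal scikit-learn sketch of this configuration, trained here on synthetic stand-in data rather than NSL-KDD:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for NSL-KDD: column 0 mimics an error-rate feature
# that is higher for attack traffic; the other columns are noise.
X = rng.random((400, 5))
y = (X[:, 0] > 0.5).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# Majority vote across the 100 trees produces the final prediction,
# and impurity-based importances rank the features.
print(clf.predict([[0.9, 0.1, 0.1, 0.1, 0.1]]))  # [1]: clearly in the attack region
print(clf.feature_importances_.argmax())          # 0: the informative column
```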

2. Cloud Environment Considerations


 Scalability:
o The model is deployed in a cloud architecture designed to scale seamlessly. It
can process high-throughput data streams, crucial for real-time threat
detection.
o Parallelization capabilities inherent to the Random Forest algorithm align with
distributed processing frameworks in the cloud.

SREYASH SOMESH MISHRA


 Resource Management:
o Real-time CPU and memory resource allocation ensures operational
efficiency, even under fluctuating workloads.
o Elastic resource scaling prevents underutilization or overloading, minimizing
operational costs while maintaining performance.

Training and Adaptation Strategies


1. Handling Imbalanced Data
 Challenge: The NSL-KDD dataset has an imbalanced distribution of normal and DoS
samples, posing a risk of bias during training.
 Current Strategy:
o The model prioritizes high sensitivity to DoS attacks, favoring false positives
over missed detections to ensure robust threat detection.
 Future Enhancements:
o Techniques such as class weighting, Synthetic Minority Over-sampling
(SMOTE), or adaptive sampling can be introduced to balance learning.
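Class weighting, for example, is a small change in scikit-learn (SMOTE would additionally require the imbalanced-learn package and is not shown); the toy data below is illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight

rng = np.random.default_rng(2)
X = rng.random((300, 3))
y = np.array([0] * 250 + [1] * 50)   # imbalanced toy labels (5:1)

# 'balanced' reweights each class inversely to its frequency:
# n_samples / (n_classes * bincount) -> [0.6, 3.0] here
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
print(weights)

# The same reweighting can be applied directly during training
clf = RandomForestClassifier(n_estimators=50, class_weight="balanced",
                             random_state=0).fit(X, y)
```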

2. Feature Engineering and Selection


 Methodology:
o Using statistical analysis, key features were selected based on their ability to
differentiate between normal and attack traffic.
o Features with low predictive power or high redundancy were excluded to
streamline computation.
 Impact:
o Reduced complexity and faster inference, enhancing the model's real-time
detection capabilities in resource-constrained cloud environments.



Computational Efficiency
1. Resource Optimization
 Inference Time:
o The cloud-based deployment ensures predictions are generated with minimal
latency, essential for cybersecurity applications.
o Parallelized inference across distributed resources minimizes bottlenecks
during high traffic loads.
 Memory Usage:
o The model was optimized to balance ensemble complexity with memory
efficiency, ensuring scalability without performance degradation.
2. Scalability in Cloud Infrastructure
 Horizontal Scaling:
o The architecture supports dynamic scaling to accommodate increasing data
volumes. Additional cloud resources are allocated as needed, maintaining
consistent performance.
 Load Balancing:
o A built-in load balancing mechanism evenly distributes data processing
across computational nodes, avoiding potential overload scenarios.

Performance Metrics and Outputs


 The model’s predictions, inference times, and memory usage metrics are logged
systematically.
 Outputs are stored in structured CSV files, enabling easy access and analysis of
results for both operational insights and model evaluation.

Results from training data


 Training Accuracy: 0.9997
 Training time: 3.14 seconds
 Memory used for training: 8.91 MB
 Inference time: 0.15 seconds
 Memory used for inference: 0.03 MB



 Confusion matrix of training data:

Results from testing data


 Testing Accuracy: 0.9832
 Precision: 0.9589
 Recall: 0.9760
 Inference time: 0.09 seconds
 Memory used for inference: 0.95 MB
 Confusion matrix of testing data:



Conclusion
This Random Forest-based solution offers an effective, scalable approach to DoS attack
detection. Its deployment within a cloud environment combines the algorithm’s robustness
with the flexibility and efficiency of modern distributed systems. While current strategies
focus on ensuring high sensitivity and speed, future iterations can further enhance balance
and adaptability, ensuring continued relevance in evolving cybersecurity landscapes.

Methods to improve test accuracy in Random Forest Model


1. Optimize Hyperparameters
 Increase the Number of Trees (n_estimators): Increasing the number of trees
improves model stability but may increase computation time.
 Max Features (max_features): Experiment with different values for the number of
features considered for each split (e.g., sqrt, log2, or a fraction of total features).
 Tree Depth (max_depth): Constrain or expand the depth of trees to balance bias and
variance.
 Minimum Samples (min_samples_split and min_samples_leaf): Adjust these
parameters to ensure splits capture meaningful patterns without overfitting.
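These hyperparameters can be searched jointly with scikit-learn's GridSearchCV; the grid and the synthetic stand-in data below are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic stand-in labels

param_grid = {
    "n_estimators": [50, 100],
    "max_features": ["sqrt", "log2"],
    "max_depth": [None, 8],
    "min_samples_leaf": [1, 3],
}
# 3-fold cross-validation over every combination in the grid
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

The cross-validated `best_score_` gives a less optimistic estimate of generalization than training accuracy alone.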



2. Feature Engineering
 Feature Importance: Identify and remove less significant features.
 Feature Transformation: Standardize or normalize features to improve
interpretability and stability, especially if features vary in scale.
 Interaction Terms: Add interaction terms between features if domain knowledge
suggests potential dependencies.

3. Handle Class Imbalance


 If the dataset is imbalanced:
o Use class weights
o Use over-sampling or under-sampling to balance classes.

4. Cross-Validation
 Use K-Fold cross-validation to evaluate the Random Forest model’s performance
across multiple splits of the dataset, reducing overfitting.



Cloud-Based ANN Model for DoS Attack Detection
The cloud-based machine learning model incorporates an Artificial Neural Network (ANN)
architecture to efficiently detect Denial-of-Service (DoS) attacks, leveraging the NSL-KDD
dataset for training and evaluation.
The ANN model is optimized for environments where real-time data processing, scalability,
and adaptability are critical. By harnessing the flexibility of neural networks, this approach
addresses the limitations of traditional tabular-data-specific methods, enabling enhanced
detection accuracy and versatility in dynamic cloud-based cybersecurity systems.

Model Architecture and Key Components


1. Artificial Neural Network (ANN)
 Purpose:
o The ANN is a versatile model capable of learning complex, non-linear patterns
within data, making it suitable for detecting intricate and evolving attack
behaviours.
 Key Characteristics:
o Multi-Layer Structure:
The ANN comprises three fully connected layers:
 Input Layer: Matches the dimensionality of the NSL-KDD dataset
features.
 Hidden Layers: Two layers with 64 and 32 neurons, respectively, each
activated by ReLU functions to introduce non-linearity.
 Output Layer: A single neuron with a sigmoid activation function for
binary classification (attack vs. normal).
o Training Efficiency:
The ANN employs binary cross-entropy as the loss function and the Adam
optimizer for efficient and adaptive learning.
o Scalability:
The ANN architecture is designed to balance computational efficiency with
flexibility, enabling deployment in cloud environments handling large-scale
traffic data.
2. Cloud Environment Considerations
 Adaptability:
o The ANN’s ability to learn evolving patterns makes it suitable for dynamic
threat landscapes where attack patterns frequently change.



 Parallel Processing:
o The model’s inference process is optimized for multi-threaded cloud
infrastructure, enabling real-time classification of high-throughput traffic
streams.
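The three-layer structure described above (input → 64 → 32 → 1, ReLU hidden activations, sigmoid output, binary cross-entropy, Adam) can be sketched in PyTorch; the choice of framework and the 41-feature input size are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Input -> 64 (ReLU) -> 32 (ReLU) -> 1 (sigmoid) for binary attack/normal
ann = nn.Sequential(
    nn.Linear(41, 64), nn.ReLU(),   # 41 raw NSL-KDD features assumed
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)
criterion = nn.BCELoss()                          # binary cross-entropy
optimizer = torch.optim.Adam(ann.parameters())    # adaptive learning rates

x = torch.randn(16, 41)                  # toy batch of 16 connections
y = torch.randint(0, 2, (16, 1)).float()
loss = criterion(ann(x), y)
loss.backward()                          # one gradient step
optimizer.step()
print(ann(x).shape)                      # torch.Size([16, 1]), values in (0, 1)
```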

Training and Adaptation Strategies


1. Addressing Dataset Challenges
 Imbalanced Data Handling:
o The NSL-KDD dataset contains an uneven distribution of DoS and normal
samples. The model incorporates techniques like over-sampling and class
weighting to ensure balanced learning and robust detection.
 Normalization:
o All features were normalized using a standard scaler to streamline training and
reduce the risk of gradient instability.
2. Iterative Training
 Optimization:
o The ANN was trained for 50 epochs with a batch size of 128, using early
stopping to prevent overfitting.
 Feature Engineering:
o Key features were selected through statistical methods to improve learning
efficiency and computational speed.
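The normalization step mentioned above can be sketched with scikit-learn's StandardScaler; the scaler is fitted on training data only and its parameters are reused at inference:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
# Toy features on very different scales, as in raw network traffic data
scales = np.array([1.0, 10.0, 100.0, 1000.0])
X_train = rng.random((100, 4)) * scales
X_test = rng.random((20, 4)) * scales

scaler = StandardScaler().fit(X_train)   # learn per-column mean and std
X_train_std = scaler.transform(X_train)  # zero mean, unit variance per column
X_test_std = scaler.transform(X_test)    # same parameters reused at test time
print(np.allclose(X_train_std.mean(axis=0), 0, atol=1e-9))
```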

Computational Efficiency
1. Inference Time and Resource Utilization
 Low Latency:
o The ANN processes batches of data with minimal latency, suitable for real-
time applications.
 Memory Usage:
o Compared to Random Forest, the ANN exhibits comparable memory usage
but excels in adaptability to new data patterns.



2. Scalability in Cloud Infrastructure
 Dynamic Resource Allocation:
o The cloud infrastructure supports dynamic scaling, allowing the ANN to
process increasing volumes of flagged data with consistent performance.
 Load Balancing:
o Automated load balancing mechanisms distribute inference tasks evenly
across computational nodes, ensuring seamless operation.

Performance Metrics and Outputs


Results from training data
 Training Accuracy: 0.9996
 Training time: 264.18 seconds
 Memory used for training: 33.31 MB
 Inference time: 0.05 seconds
 Memory used for inference: 23.90 MB
 Confusion matrix of training data:



Results from testing data
 Testing Accuracy: 0.9448
 Precision: 0.8325
 Recall: 0.9803
 F1 score: 0.9004
 Inference time: 0.01 seconds
 Memory used for inference: 13.79 MB
 Confusion matrix of testing data:



Conclusion
The ANN-based solution provides a flexible and efficient method for detecting DoS attacks
in cloud environments. Its ability to handle evolving attack patterns and non-linear data
relationships makes it a valuable component of modern cybersecurity frameworks.

Methods to improve test accuracy in ANN model


1. Enhance Network Architecture
 More Layers: Add additional hidden layers to improve the capacity of the model.
 More Neurons: Increase the number of neurons in each layer to allow the model to
capture complex patterns.
 Activation Functions: Experiment with advanced activation functions like Leaky
ReLU or ELU to address vanishing gradients.

2. Optimize Hyperparameters
 Learning Rate: Use a learning rate scheduler to decay the learning rate during
training.
 Batch Size: Experiment with different batch sizes. Smaller batches can improve
convergence, while larger batches stabilize updates.
 Epochs: Train the model for more epochs while monitoring for overfitting using
validation metrics.

3. Data Augmentation and Preprocessing


 Augmentation: Add noise or perturb features to create synthetic data.
 Normalization: Ensure all features are on the same scale.



Justification for Employing the Random Forest Model for Cloud-
Based DoS Attack Detection in IoT Networks

The selection of an appropriate machine learning model for detecting Denial-of-Service
(DoS) attacks in IoT-based cloud environments is critical to ensuring robust and
cost-effective cybersecurity.
Among various options, Random Forest (RF) emerged as the superior choice over Artificial
Neural Networks (ANN) based on multiple performance metrics, data compatibility,
operational requirements, and scalability considerations.

1. Model Performance
 Training and Test Accuracy
o Random Forest:
 Training Accuracy: 99.97% (0.9997)
 Test Accuracy: 98.32% (0.9832)
o Artificial Neural Network:
 Training Accuracy: 99.96% (0.9996)
 Test Accuracy: 94.48% (0.9448)

 The Random Forest model's test accuracy of 98.32% outperforms the ANN's 94.48%,
demonstrating its ability to generalize better to unseen data.
 This difference is especially significant in real-world applications, where
generalization is critical for detecting previously unseen attack patterns while
minimizing false negatives.
 High test accuracy also ensures enhanced reliability in detecting malicious activities
across IoT networks.
 Metrics Summary (RF vs ANN)
o Although the ANN demonstrates high recall (indicating a strong ability to
detect DoS attacks effectively), its precision of 83.25% reveals a
comparatively higher rate of false positives. This could lead to unnecessary
alerts or disruptions in a real-time detection system.
o In contrast, the Random Forest model achieves superior precision (95.89%)
and recall (97.60%), striking a better balance between accurately identifying
attacks and minimizing false positives.
o In a cloud environment, where both computational efficiency and predictive
reliability are critical, Random Forest’s ability to reduce false positives
provides a clear operational advantage.



2. Efficiency in Resource-Constrained Cloud Environments
 Memory and Computational Costs
o Random Forest requires significantly less RAM during inference
compared to ANN, making it more suitable for scalable cloud
infrastructures. In cloud-based IoT networks, where real-time decision-
making is crucial, optimizing resource utilization directly impacts cost-
efficiency.
o The ANN's computational complexity, stemming from backpropagation
and matrix multiplications, leads to higher latency and resource demand,
which may cause bottlenecks under high data traffic.

 Inference Time
o Random Forest’s architecture supports parallel inference across its
decision trees, enabling faster predictions. This feature is particularly
advantageous in cloud deployments, where latency-sensitive applications
demand rapid responses to detected anomalies.

3. Robustness and Scalability


 Robustness to Data Variations: The ensemble approach of Random Forest
inherently resists overfitting and performs well across diverse data distributions.
This characteristic ensures consistent performance when analyzing traffic from
heterogeneous IoT devices with varying network behaviours.

 Cloud Integration: With its ability to process large-scale, high-dimensional data,
Random Forest integrates seamlessly into distributed cloud architectures. The
model's parallelizable nature aligns with modern cloud computing frameworks,
ensuring scalability to handle increasing IoT device connections without
compromising performance.

4. Practical Implications for IoT Networks


 Reducing False Negatives
o For IoT networks, minimizing false negatives (missed detections of
attacks) is crucial, as even a single undetected DoS attack can disrupt
critical services. The Random Forest model’s superior test accuracy and
generalization capabilities directly address this need, providing a reliable
shield against evolving threats.

 Flexibility in Future Enhancements


o The RF model's interpretability allows for straightforward adjustments:
 Feature importance rankings can guide further optimization by
focusing on the most impactful attributes.



 Future integration with imbalanced data-handling techniques (e.g.,
class weighting or SMOTE) can further enhance performance
without altering the fundamental architecture.

5. Performance in Real-Time IoT Scenarios


 IoT networks often generate vast amounts of data requiring real-time threat
detection. The Random Forest model, with its ability to:
o Scale horizontally in cloud environments,
o Maintain low-latency predictions, and
o Operate cost-efficiently under resource constraints, is better suited than
ANN for the dynamic requirements of IoT cybersecurity.

Conclusion
Based on empirical results and practical considerations, the Random Forest model is the
optimal choice for cloud-based DoS attack detection in IoT networks. Its superior test
accuracy, efficiency with tabular data, low resource demands, and seamless scalability make
it a robust solution for safeguarding IoT environments against the growing threat of DoS
attacks.



Comparison with some other models for Cloud deployment

Model: Random Forest (proposed)
Architecture: Ensemble of decision trees; combines results via averaging (regression) or majority voting (classification).
Strengths for cloud deployment:
· Efficient for tabular data.
· Handles categorical and numerical data well.
· Less sensitive to overfitting.
· Scales easily in distributed cloud environments.
Limitations for cloud deployment:
· Not suitable for image or sequential data.
· Performance may degrade with high-dimensional data.
· Requires significant memory for large ensembles.

Model: Artificial Neural Network (ANN)
Architecture: Fully connected layers; input layer → hidden layers (non-linear activations) → output layer.
Strengths for cloud deployment:
· Handles complex, non-linear relationships.
· Well-suited for cloud GPUs/TPUs.
· Adaptable to a variety of problems.
· Parallelizable across multiple nodes.
Limitations for cloud deployment:
· Requires extensive preprocessing (e.g., feature scaling).
· Prone to overfitting.
· Computationally expensive and may require high-end cloud infrastructure.

Model: Support Vector Machine (SVM)
Architecture: Maps data to a high-dimensional feature space; finds the optimal hyperplane for classification/regression.
Strengths for cloud deployment:
· Works well for smaller datasets.
· Effective for linear and non-linear problems with the kernel trick.
· Requires less memory compared to ensemble models.
Limitations for cloud deployment:
· Inefficient for large datasets.
· Computationally expensive in high dimensions.
· Poorly parallelized for distributed cloud systems.

Model: Convolutional Neural Network (CNN)
Architecture: Specialized for image and spatial data; convolutions extract features → pooling layers reduce dimensions → fully connected layers classify.
Strengths for cloud deployment:
· Best for image, video, and spatial data.
· Supported by cloud GPUs/TPUs for high-speed computation.
· Pre-trained models like ResNet available for transfer learning.
Limitations for cloud deployment:
· Not suitable for tabular data.
· Training is resource-intensive (time and compute).
· Requires advanced cloud infrastructure with large-scale parallelism.



Enhanced Communication Protocols for Edge-to-Cloud
Synchronization

Enhanced communication protocols between edge and cloud systems are essential to address
the growing demands of modern IoT and AI-driven applications. Traditional communication
methods often result in high latency, excessive bandwidth usage, and increased energy
consumption due to the constant transfer of data between edge devices and the cloud.
Enhanced protocols enable selective data transmission, ensuring that only necessary
information is sent to the cloud for processing, reducing communication overhead and
conserving resources. Additionally, they incorporate security features like end-to-end
encryption and integrity checks to safeguard sensitive data, which is crucial in industries
handling confidential information. These protocols strike a balance between edge autonomy
and cloud computation, ensuring scalable, efficient, and secure communication in distributed
systems.

1. Asynchronous Communication Protocol


In real-time IoT applications, using asynchronous communication protocols allows edge
devices to transmit data without pausing ongoing detection processes. This is crucial for
minimizing latency, especially in time-sensitive systems such as healthcare IoT and security
networks.
Implementation Strategy:
 Event-Driven Data Transmission: Trigger data transmission based on detection events
rather than a constant stream.
 Non-Blocking Communication Channels: Use non-blocking Input/Output (I/O)
mechanisms to prevent data transmission from interfering with ongoing edge
processes.
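The two strategies above can be sketched with Python's asyncio: a detection loop hands flagged events to a separate sender task through a queue instead of blocking on network I/O. This is a minimal illustration, not the report's implementation; the in-memory `sent` list stands in for a real cloud upload, and the `serror_rate > 0.5` detection rule is a hypothetical threshold.

```python
import asyncio

async def detect(queue, samples):
    """Detection loop: flags suspicious samples without blocking on I/O."""
    flagged = 0
    for s in samples:
        if s["serror_rate"] > 0.5:      # hypothetical detection rule
            queue.put_nowait(s)         # non-blocking hand-off to the sender
            flagged += 1
        await asyncio.sleep(0)          # yield control; detection never stalls
    await queue.put(None)               # sentinel: no more events
    return flagged

async def transmit(queue, sent):
    """Sender task: drains the queue independently of the detection loop."""
    while True:
        event = await queue.get()
        if event is None:
            break
        sent.append(event)              # stand-in for an actual cloud upload

async def main(samples):
    queue, sent = asyncio.Queue(), []
    flagged, _ = await asyncio.gather(detect(queue, samples),
                                      transmit(queue, sent))
    return flagged, sent

samples = [{"serror_rate": 0.9}, {"serror_rate": 0.1}, {"serror_rate": 0.7}]
flagged, sent = asyncio.run(main(samples))
```

Because the sender runs as its own task, a slow network only delays uploads, never the detection loop itself.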

2. Low-Bandwidth, Efficient Data Packaging


Instead of transmitting full data packets, only key attributes indicative of potential DoS
attacks are packaged. This includes selected features such as serror_rate, src_bytes, dst_bytes,
and other high-variance traffic characteristics.
 Fast Compression: Use algorithms like LZ4 or Zstandard, which offer fast compression and decompression speeds with minimal CPU overhead.
 Batch Processing with Adaptive Scheduling: If multiple suspicious events are detected within a short period, the edge device can batch them and send a single, compressed data packet, reducing the number of transmissions and conserving bandwidth.
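As a sketch of this packaging step, the snippet below keeps only a whitelist of key attributes, batches the events, and compresses the result. Python's built-in zlib is used here as a stand-in for LZ4/Zstandard (which are third-party libraries), and the feature whitelist is illustrative.

```python
import json
import zlib

# Hypothetical whitelist of high-variance DoS indicators
KEY_FEATURES = ("serror_rate", "src_bytes", "dst_bytes")

def package_batch(events):
    """Keep only key attributes, batch the events, and compress the payload."""
    reduced = [{k: e[k] for k in KEY_FEATURES if k in e} for e in events]
    return zlib.compress(json.dumps(reduced).encode())  # stand-in for LZ4/Zstd

def unpack_batch(payload):
    """Cloud side: decompress and parse the batched events."""
    return json.loads(zlib.decompress(payload))

events = [{"serror_rate": 0.8, "src_bytes": 10000, "dst_bytes": 0,
           "duration": 3, "flag": "S0"}] * 5
payload = package_batch(events)
```

Dropping non-essential fields before compression means the payload shrinks twice: once from feature selection and once from the compressor exploiting redundancy across batched events.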

Gurupriya D
3. Security Considerations
 Lightweight Encryption: Implement lightweight encryption protocols such as AES-128 for data before transmission.
 Integrity Verification: Include checksums or hash-based message authentication codes (HMACs) to verify that data received by the cloud has not been tampered with during transmission.
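A minimal sketch of the HMAC integrity check using Python's standard hmac module follows. The AES-128 encryption step is omitted because it requires a third-party library such as PyCryptodome; the shared key is a hypothetical pre-shared secret.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-edge-cloud-key"   # hypothetical pre-shared key

def sign(payload: bytes) -> bytes:
    """Prepend an HMAC-SHA256 tag so the cloud can detect tampering."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest() + payload

def verify(message: bytes):
    """Return the payload if its 32-byte tag checks out, else None."""
    tag, payload = message[:32], message[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest avoids timing side channels during comparison
    return payload if hmac.compare_digest(tag, expected) else None

message = sign(b'{"event": "dos_alert", "src_ip": "10.0.0.5"}')
```

In practice the payload would be encrypted first and the HMAC computed over the ciphertext, so the cloud rejects any packet modified in transit before attempting decryption.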

4. Real-Time Synchronization Mechanism


 Heartbeat Signals: The edge device can send periodic "heartbeat" signals to the cloud to confirm that it is active and functioning as expected. This helps the cloud identify potential communication disruptions quickly.
 Adaptive Transmission: Depending on the network conditions and threat level, the edge device can adapt the frequency and size of its data transmissions. For example, during high network congestion, it can switch to a lower transmission rate to prevent bandwidth overload.
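These two mechanisms can be sketched as follows. The heartbeat payload and the doubling/halving back-off policy are illustrative assumptions, not prescribed by the text; a real deployment would publish the heartbeat over its transport protocol of choice.

```python
import time

class AdaptiveSender:
    """Heartbeats plus a transmission interval that backs off under congestion."""

    def __init__(self, base_interval=1.0, max_interval=30.0):
        self.base, self.max = base_interval, max_interval
        self.interval = base_interval   # seconds between transmissions

    def heartbeat(self):
        """Payload a real system would publish periodically to the cloud."""
        return {"type": "heartbeat", "ts": time.time(), "status": "alive"}

    def on_congestion(self):
        """Halve the transmission rate (double the interval), capped at max."""
        self.interval = min(self.interval * 2, self.max)

    def on_recovery(self):
        """Step back toward the base rate once the network clears."""
        self.interval = max(self.interval / 2, self.base)

sender = AdaptiveSender()
sender.on_congestion()
sender.on_congestion()   # two congestion signals: interval 1.0 -> 2.0 -> 4.0
```

Missed heartbeats on the cloud side then serve as the trigger for marking the edge device as unreachable.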

With the growing adoption of IoT and edge computing, efficient communication between
edge devices and cloud systems is crucial. Traditional models often transmit all data to the
cloud, leading to high bandwidth usage, latency, and energy consumption. The Small-Big
Model Framework addresses these challenges by enabling selective communication, reducing
unnecessary data transmission while maintaining high performance.

Introduction to the Small-Big Model Framework
The Small-Big Model Framework is a communication strategy designed to optimize data
transmission and processing between edge devices and cloud systems. This framework
addresses the challenges of traditional edge-cloud architectures, including high bandwidth
usage, latency issues, and limited scalability. By enabling selective communication, the
framework ensures that only critical data is transmitted to the cloud, while simpler tasks are
processed locally on edge devices. This approach significantly reduces communication
overhead while maintaining the accuracy and efficiency of the system, making it ideal for
applications in IoT, surveillance, and industrial automation.

Architecture of the Framework


The framework is built on two main components: a Small Model operating on the edge
device and a Big Model hosted on the cloud. The Small Model is lightweight and optimized
for processing straightforward cases directly on the edge device. This local processing
reduces the dependency on the cloud for all computations, thereby minimizing latency and
conserving bandwidth. When the Small Model encounters a complex or uncertain case, it
flags it for further analysis by the Big Model, which has access to larger computational
resources and more sophisticated algorithms.
At the core of the framework lies the Difficult-Case Discriminator (DCD), which is
responsible for identifying which data needs to be transmitted to the cloud. The DCD applies
a confidence threshold: cases with high-confidence predictions are resolved locally, while
those with lower confidence are offloaded to the cloud for more detailed analysis.
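The DCD's routing rule can be sketched as a simple confidence-threshold function. The 0.8 threshold and the (label, confidence) prediction format are assumptions for illustration; the framework leaves the threshold as a tunable parameter.

```python
def discriminate(prediction, threshold=0.8):
    """Difficult-Case Discriminator: route by prediction confidence."""
    label, confidence = prediction
    if confidence >= threshold:
        return ("edge", label)    # high confidence: resolved locally
    return ("cloud", label)       # low confidence: offload to the Big Model

# Small Model outputs: one clear case, one uncertain "difficult case"
routes = [discriminate(p) for p in [("normal", 0.95), ("dos_suspect", 0.55)]]
```

Raising the threshold offloads more cases to the cloud (higher accuracy, more bandwidth); lowering it keeps more traffic on the edge.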

Communication Workflow
The Small-Big Model Framework introduces an innovative approach to communication
between edge devices and cloud systems, ensuring optimal resource utilization and efficient
data processing. The workflow involves four primary stages: edge processing, selective
transmission, cloud processing, and result integration. Each stage is meticulously designed to
balance computational efficiency and accuracy, reducing unnecessary data transfers while
maintaining high detection performance.

1. Edge Processing
The edge device captures raw input data, such as images or sensor readings, and preprocesses
it to ensure compatibility with the edge model. A lightweight Small Model runs locally on the
edge device to analyze the input data.
This model is optimized for speed and energy efficiency, capable of making decisions for
straightforward cases (e.g., objects with high confidence levels or simple classifications).
A confidence threshold is applied to predictions:
 If the confidence score exceeds the threshold, the data is processed and resolved
locally.
 If the confidence score falls below the threshold, the data is flagged as a "difficult
case" for further processing.
2. Difficult-Case Discrimination
The Difficult-Case Discriminator (DCD) evaluates flagged data to ensure only essential
information is sent to the cloud.
The flagged data is packaged efficiently, often compressed or reduced to its most critical
features, to minimize bandwidth usage.
The DCD is an integral part of the framework, allowing the system to maintain a balance
between bandwidth savings and data integrity.

3. Selective Transmission
Transmission of Flagged Data:
 Only flagged "difficult cases" are transmitted to the cloud. This selective
communication drastically reduces the volume of data sent compared to traditional
systems that offload all data.
 Lightweight serialization techniques, such as Protocol Buffers or MessagePack, are
often used to further compress the data packets before transmission.
Communication Optimization:

 Efficient communication protocols, such as MQTT or WebSocket, are employed to ensure low-latency and reliable data transfer.
 Metadata, including timestamps and confidence levels, is attached to each data packet
to prioritize cloud processing of critical cases.
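The metadata-tagged packaging and confidence-based prioritization can be sketched as below, using JSON in place of Protocol Buffers/MessagePack (both third-party libraries); the packet layout is an illustrative assumption.

```python
import json
import time

def make_packet(features, confidence):
    """Wrap flagged features with metadata the cloud can use for triage."""
    return json.dumps({
        "ts": time.time(),
        "confidence": confidence,   # lower confidence => more uncertain case
        "features": features,
    })

def cloud_queue(packets):
    """Order packets so the most uncertain (lowest-confidence) cases go first."""
    return sorted(packets, key=lambda p: json.loads(p)["confidence"])

packets = [make_packet({"src_bytes": 120}, 0.7),
           make_packet({"src_bytes": 9800}, 0.3)]
ordered = cloud_queue(packets)
```

Sorting by confidence lets the Big Model spend its capacity on the cases the edge was least sure about.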
4. Cloud Processing
The Big Model hosted on the cloud receives the flagged data for in-depth analysis.
This model is designed for high computational capacity, leveraging advanced algorithms and
a larger dataset to refine predictions and improve accuracy.
The cloud model produces detailed results, particularly for complex scenarios that the edge
model could not handle effectively.
5. Result Integration
The refined results from the cloud are sent back to the edge device for integration.
The edge system combines the cloud's outputs with locally processed results, ensuring a
unified and accurate output.
This integration process enables the system to deliver real-time insights with minimal latency,
utilizing both edge and cloud resources effectively.

Testing and Evaluation of the Datasets


In evaluating the Small-Big Model Framework, the researchers conducted extensive testing
on several benchmark datasets, simulating real-world conditions where edge devices are
required to process data efficiently and transmit it to the cloud for further analysis. The
evaluation is designed to assess both the performance of the edge models and the efficiency
of the communication between the edge and cloud components.

1. Dataset Selection
For the testing phase, various image and sensor-based datasets were selected to reflect
different levels of complexity and real-world applicability:

 VOC (Pascal Visual Object Classes): A widely used benchmark for object detection
tasks, the VOC dataset provides labeled data on images with various object classes,
making it ideal for testing edge-based object recognition models.
 COCO (Common Objects in Context): This dataset consists of diverse images with
annotations for object detection, segmentation, and captioning. It includes more
complex scenarios with objects in varied contexts, making it suitable for the cloud-
based Big Model's processing.
 HELMET: A specialized dataset designed for object detection, particularly in
environments like industrial settings where precise detection of equipment or human
activity is required.
These datasets offer a mix of simple and complex tasks that allow for testing the Difficult-
Case Discriminator (DCD)'s ability to distinguish between easily resolvable cases and those
that require cloud offloading.

2. Preprocessing and Model Setup


Before testing, the datasets were pre-processed to ensure that they were compatible with both
the Small and Big Models:
 Preprocessing: The data was normalized, resized, and converted into formats suitable
for efficient processing on edge devices and cloud infrastructure. For instance, images
were resized to reduce computational load while maintaining sufficient resolution for
detection tasks.
 Model Configuration: The edge device ran a Small Model, which was a lightweight,
resource-efficient object detection model. The cloud system hosted a more
computationally expensive Big Model capable of handling complex detection tasks.

3. Testing Process
The testing phase followed a clear progression:
 Initial Edge Processing: The Small Model processed input from the datasets directly
on the edge device. For simple object detection cases, such as identifying well-defined
objects in clear environments, the Small Model handled the processing locally.
 Flagging Difficult Cases: For more complex or uncertain cases, such as detecting
partially occluded objects or objects in cluttered backgrounds, the Small Model
flagged the data as a "difficult case." These cases were then forwarded to the cloud for
detailed analysis by the Big Model.
 Data Transmission: The flagged data was serialized and compressed using
lightweight techniques like Protocol Buffers or MessagePack, ensuring efficient
transmission. The data was sent over low-latency communication protocols, such as
MQTT or WebSocket, to minimize the delay between edge and cloud processing.

 Cloud Processing and Feedback: Once the data reached the cloud, the Big Model
performed more comprehensive processing, refining the detection and making more
precise predictions. The results were sent back to the edge device for integration.
 Output Integration: The edge device combined its local processing results with the
cloud-based results to generate a final output, ensuring the system provided both real-
time responses for simple cases and high-accuracy results for complex scenarios.

4. Evaluation Metrics
The success of the Small-Big Model Framework was evaluated using a range of metrics:
 Accuracy: This was measured using standard object detection metrics like mean
average precision (mAP). For the datasets tested, the framework achieved a mAP
ranging from 91.22% to 92.52%, indicating high accuracy in both local and cloud-
based processing.
 Latency: The latency was evaluated based on the time taken for the system to process
data from the edge device to the cloud and back. The use of local processing ensured
minimal delays for straightforward cases, while the cloud-based processing added
slight delays for more complex cases. However, these delays were offset by the
efficiency of selective transmission and data compression.
 Bandwidth Efficiency: The transmission of flagged data was optimized to reduce
bandwidth consumption. About 50% of the images were processed locally on the
edge, with only the difficult cases sent to the cloud, significantly reducing the amount
of data transmitted and conserving network bandwidth.
 Energy Consumption: The edge devices were evaluated for energy efficiency by
measuring the power consumption during local processing and when transmitting
data. By minimizing the number of transmissions to the cloud, the system reduced the
overall energy usage on the edge device.

5. Results
Data Transmission Efficiency: By only transmitting difficult cases, the framework reduced
the amount of data sent to the cloud by 50% compared to traditional systems that would send
all data regardless of complexity.
Performance Gains: The Small-Big Model achieved a high balance of accuracy, bandwidth
efficiency, and latency, making it suitable for real-time applications where edge devices have
limited resources but require robust cloud-based support.

Benefits of the Framework
The Small-Big Model Framework offers several key benefits:
 Bandwidth Optimization: By processing easy cases locally, the framework
significantly reduces the amount of data transmitted to the cloud.
 Reduced Latency: Local processing ensures quicker response times for
straightforward cases, while only complex cases experience cloud-related delays.
 Scalability: The framework is adaptable to a wide range of edge devices, making it
suitable for diverse applications.
 Energy Efficiency: By minimizing cloud dependency, the framework reduces energy
consumption on both the edge device and the cloud.

Conclusion
The Small-Big Model Framework provides a practical and efficient solution for edge-to-
cloud communication, addressing critical challenges in bandwidth usage, latency, and energy
consumption. Its innovative approach to selective data transmission ensures that the system
remains scalable and accurate, even in resource-constrained environments. This framework
holds great promise for applications requiring real-time processing and efficient resource
management.

XAI and Blockchain-Powered Edge-to-Cloud System
for DoS Mitigation in IoT
The Vulnerability of IoT Networks:

The sources emphasize the rapid proliferation of IoT devices and their inherent vulnerabilities,
making them prime targets for DoS attacks. These vulnerabilities stem from factors such as:

●Limited Resources: IoT devices typically have constrained processing power, memory, and
energy, making them susceptible to attacks that overwhelm their resources.
●Insecure Communication: Many IoT devices rely on wireless communication protocols
that lack robust security measures, making them prone to interception and manipulation.
●Lack of Standardization: The diverse range of IoT devices and protocols often leads to
inconsistencies in security implementations, creating vulnerabilities that attackers can exploit.

The Need for Enhanced Security Measures:


Traditional security solutions, designed for centralized networks, often prove inadequate for
protecting the distributed and heterogeneous nature of IoT ecosystems. We propose combining
XAI and Blockchain to address the unique security challenges posed by IoT networks:

1. XAI for Transparent and Trustworthy DoS Detection:

The sources highlight the importance of explainability in AI-based security systems. XAI techniques, such as SHAP (SHapley Additive exPlanations), provide insights into the decision-making process of AI models, enhancing trust and enabling informed responses to detected anomalies.

● Understanding Feature Importance: XAI helps identify the most influential features
contributing to the detection of DoS attacks. This allows security analysts to:
○Fine-tune detection models for greater accuracy and efficiency.
○Establish security policies based on thresholds for critical features.
○Gain a deeper understanding of the nature and characteristics of attacks.
● Reducing False Positives: By explaining why a particular network flow is flagged as an
anomaly, XAI helps distinguish between legitimate traffic bursts and genuine DoS attacks. This
reduces the likelihood of mistakenly blocking harmless traffic.
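For linear models, SHAP attributions have a closed form: each feature's contribution is its weight times its deviation from a baseline (expected) value. The toy weights and baseline below are illustrative assumptions, not outputs of a trained detector, but they show how feature attributions single out the drivers of a flagged flow.

```python
# Toy linear anomaly scorer: score = sum(w_f * x_f). For linear models,
# SHAP values reduce to weight * (value - baseline mean) per feature.
WEIGHTS = {"packet_rate": 0.6, "inter_packet_time": -0.3, "src_bytes": 0.1}
BASELINE = {"packet_rate": 100.0, "inter_packet_time": 0.05, "src_bytes": 500.0}

def explain(sample):
    """Signed contribution of each feature to the anomaly score."""
    return {f: WEIGHTS[f] * (sample[f] - BASELINE[f]) for f in WEIGHTS}

# A flagged flow: very high packet rate, very low inter-packet arrival time
flagged = {"packet_rate": 900.0, "inter_packet_time": 0.001, "src_bytes": 450.0}
contrib = explain(flagged)
top_feature = max(contrib, key=lambda f: abs(contrib[f]))
```

An analyst seeing that `packet_rate` dominates the attribution (while `src_bytes` actually pulls the score down) can distinguish a flood pattern from a legitimate large transfer.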

Bhavya
Key Components of XAI Implementation

● Local Explainability for Edge Devices


○ Purpose: To explain individual detection decisions made at the edge layer.
○ Approach:
■ Use lightweight explainability techniques such as SHAP (SHapley
Additive exPlanations) and LIME (Local Interpretable Model-agnostic
Explanations).
■ Generate feature-level explanations for each traffic sample flagged as
anomalous.
○ Example:
■ If a CNN-LSTM model detects a traffic flow as a DoS attack, SHAP can
explain that features like a high packet rate and a low inter-packet arrival
time contributed most to the decision.
○ Implementation Steps:
■ Train an edge model on labeled traffic data (e.g., NSL-KDD dataset).
■ Integrate SHAP to calculate the contribution of each feature to the model's
predictions.
■ Provide visual explanations or summary reports to the security operator.
● Global Explainability for Cloud Analysis
○ Purpose: To provide insights into broader patterns and trends in network
behavior.
○ Approach:
■ Use global explainability methods such as feature importance analysis and
visualization techniques (e.g., bar charts, heatmaps).
■ Apply these techniques to advanced models like Graph Neural Networks
(GNNs) at the cloud layer to explain relationships between nodes
(devices) and edges (traffic flows).
○ Example:
■ A GNN analyzing the IoT network topology may identify a cluster of
devices exhibiting synchronized abnormal traffic patterns, indicating a
coordinated DoS attack.
● Contextual Explanations
○ Purpose: To enhance interpretability by incorporating contextual information
about IoT devices and traffic patterns.
○ Approach:
■ Use device metadata (e.g., device type, typical behavior patterns) and
network characteristics (e.g., peak hours, baseline traffic levels) to
contextualize anomaly detections.

○ Example:
■ A spike in traffic from a sensor during maintenance hours may be flagged
as an anomaly. However, XAI can contextualize this spike using historical
data and explain it as non-malicious.
● Visualization Tools
○ Purpose: To present explanations in an intuitive and actionable manner.
○ Techniques:
■ SHAP Summary Plots: Show the average contribution of each feature
across all flagged instances.
■ SHAP Waterfall Plots: Break down the cumulative effect of features
leading to a specific prediction.
■ Feature Importance Heatmaps: Visualize the importance of various
features across different traffic samples.

Waterfall plot:

2. Blockchain for Secure Data Management and Collaboration:

The sources propose leveraging blockchain technology to enhance security and collaboration in
the IoT ecosystem:
● Immutable and Transparent Logging: Blockchain provides a tamper-proof and
auditable record of detected DoS events and blacklisted IP addresses. This:
○Enhances accountability and trust among network participants.
○Facilitates forensic analysis and investigation of attacks.
○Prevents attackers from manipulating or erasing evidence of their actions.

● Decentralized Threat Intelligence: Blockchain enables secure sharing of threat
intelligence among distributed IoT devices and network nodes. This allows:
○Real-time updates on emerging threats and attack patterns.
○Collaborative mitigation efforts to block malicious traffic at multiple points.
○Increased resilience against attacks targeting individual devices or nodes.

Core Components of Blockchain Implementation

● Decentralized Traffic Logging


○ Purpose:
■ Blockchain ensures that traffic data and anomaly detection logs are
securely recorded, preventing tampering and providing a reliable source
for forensic analysis.
○ How It Works:
■ Each anomaly detected by edge devices is encapsulated as a transaction
and added to a Blockchain ledger.
■ The ledger is replicated across multiple Blockchain nodes, creating a
decentralized and tamper-proof record.
○ Application in DoS Detection:
■ Logs include attributes such as source and destination IP addresses, packet
sizes, traffic rates, timestamps, and detection results.
■ In case of a large-scale DoS attack, these logs allow administrators to trace
back to the origin and identify patterns of malicious behavior.
● Smart Contracts for Automated Responses
○ Purpose:
■ Smart contracts automate defensive actions in response to detected
anomalies, reducing human intervention and ensuring consistent
enforcement of security policies.
○ How It Works:
■ A smart contract is a programmable script stored on the Blockchain that
executes predefined actions when certain conditions are met.
■ For example:
■ If multiple edge devices report traffic from the same source IP as
anomalous, the smart contract can block that IP across the network.
■ If traffic rates exceed a specific threshold, the smart contract can
throttle the connection or notify administrators.
○ Benefits:
■ Accelerates response times during active attacks.
■ Ensures consistent execution of security policies across all devices.

● Decentralized Anomaly Verification
○ Purpose:
■ Reduces reliance on a single point of failure by distributing the
verification of detected anomalies across multiple nodes.
○ How It Works:
■ Each Blockchain node validates anomalies reported by edge devices by
cross-referencing the traffic patterns logged in the ledger.
■ If a consensus is reached among nodes, the anomaly is flagged as verified,
triggering appropriate responses.
○ Application in DoS Detection:
■ Helps in detecting Distributed DoS (DDoS) attacks by correlating logs
from multiple devices to identify coordinated patterns.
● Immutable Forensic Records
○ Purpose:
■ Provides a tamper-proof history of network events for post-attack analysis
and compliance reporting.
○ How It Works:
■ Once logged, entries cannot be altered or deleted, ensuring that all records
are trustworthy and auditable.
○ Use Case:
■ In regulatory environments (e.g., healthcare IoT), Blockchain-based logs
can demonstrate compliance with cybersecurity standards.
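The tamper-evident logging idea above can be sketched with a hash-chained ledger, where each entry commits to its predecessor's hash. This is a single-node simplification of a replicated Blockchain, with hypothetical log fields; it shows why altering any logged anomaly breaks verification of the whole chain.

```python
import hashlib
import json

def _digest(event, prev):
    """Deterministic hash of an entry's content plus the previous hash."""
    return hashlib.sha256(
        json.dumps({"event": event, "prev": prev}, sort_keys=True).encode()
    ).hexdigest()

def add_block(chain, event):
    """Append a log entry whose hash commits to the previous block."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"event": event, "prev": prev, "hash": _digest(event, prev)})
    return chain

def verify_chain(chain):
    """Recompute every hash link; any tampering breaks the chain."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block["hash"] != _digest(block["event"], prev):
            return False
        prev = block["hash"]
    return True

ledger = []
add_block(ledger, {"src_ip": "10.0.0.5", "type": "DoS", "ts": 1700000000})
add_block(ledger, {"src_ip": "10.0.0.7", "type": "DoS", "ts": 1700000060})
ok_before = verify_chain(ledger)
ledger[0]["event"]["src_ip"] = "1.2.3.4"   # attacker tries to rewrite history
ok_after = verify_chain(ledger)
```

Replicating this ledger across nodes, as the framework proposes, means an attacker would additionally have to subvert a consensus majority to pass off the altered record.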

3. Edge-to-Cloud Architecture for Efficient and Scalable DoS
Mitigation:

Integrating XAI and Blockchain into an Edge-to-Cloud architecture optimizes DoS detection and
mitigation in IoT networks:
● Edge Computing for Rapid Local Detection: Lightweight XAI-enabled detection
models can be deployed on resource-constrained edge devices for real-time anomaly detection.
This enables:
○Quick identification and isolation of potential DoS attacks close to the source.
○Reduced latency in response time, minimizing the impact of attacks.
○Offloading computationally intensive tasks from the cloud.

● Cloud Computing for Verification and Collaboration: The cloud serves as a
central hub for:
○Verifying anomalies detected at the edge using more sophisticated XAI models.
○Maintaining the blockchain network for secure data storage and collaboration.
○Coordinating mitigation efforts across the entire IoT ecosystem.

Implementation Plan for XAI and Blockchain Integration in Edge-to-Cloud DoS Detection

1. Framework Overview

The proposed system leverages Explainable AI (XAI) and Blockchain technologies within an
Edge-to-Cloud architecture to enhance the detection and mitigation of Denial-of-Service (DoS)
attacks. The framework consists of two primary layers: the edge layer, responsible for real-time
detection and explanation of anomalies, and the cloud layer, which performs comprehensive
analysis and coordinated responses. Blockchain underpins the entire system, ensuring secure,
tamper-proof communication and logging.

2. Dataset Preparation

For implementation, we use multiple datasets, such as CICDDoS2019 and CIC-IoT2023. We also
use the NSL-KDD dataset, a benchmark dataset for network intrusion detection,
supplemented with synthetic IoT-specific traffic to simulate diverse attack scenarios. The
datasets include attributes such as:

● Numerical features (e.g., src_bytes, dst_bytes, packet_rate).


● Categorical features (e.g., protocol_type, service, flag).
● Labels indicating normal or attack types (e.g., DoS, Probe, R2L).

Preprocessing Steps:

1. Normalize numerical features to ensure uniform scaling.


2. Encode categorical attributes using one-hot encoding or ordinal encoding.
3. Split the dataset into:
○ Training Data: 70% for edge model training.
○ Testing Data: 30% for performance evaluation.
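The preprocessing steps above can be sketched in plain Python on toy rows (in practice a library such as scikit-learn would provide these transforms; the four sample records are illustrative, not NSL-KDD data).

```python
import random

def min_max_normalize(rows, num_cols):
    """Scale each numerical column to the [0, 1] range."""
    for c in num_cols:
        lo, hi = min(r[c] for r in rows), max(r[c] for r in rows)
        for r in rows:
            r[c] = (r[c] - lo) / (hi - lo) if hi > lo else 0.0
    return rows

def one_hot(rows, col):
    """Replace a categorical column with 0/1 indicator columns."""
    values = sorted({r[col] for r in rows})
    for r in rows:
        for v in values:
            r[f"{col}_{v}"] = 1 if r[col] == v else 0
        del r[col]
    return rows

def split(rows, train_frac=0.7, seed=42):
    """Shuffle deterministically, then cut into train/test partitions."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

data = [{"src_bytes": 0, "protocol_type": "tcp"},
        {"src_bytes": 50, "protocol_type": "udp"},
        {"src_bytes": 100, "protocol_type": "icmp"},
        {"src_bytes": 25, "protocol_type": "tcp"}]
data = one_hot(min_max_normalize(data, ["src_bytes"]), "protocol_type")
train, test = split(data)
```

Fixing the shuffle seed keeps the 70/30 split reproducible across training runs.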

3. Edge Layer Implementation

The edge layer focuses on real-time detection of anomalous traffic using a lightweight machine
learning model. A CNN-LSTM model is suitable here, as it captures both spatial and temporal
patterns in the traffic data.

● Model Functionality:
○ The CNN extracts spatial features (e.g., traffic intensity patterns).
○ The LSTM identifies sequential anomalies, such as traffic bursts indicative of
DoS attacks.
○ The output is a binary or multi-class prediction (e.g., Normal, DoS).
● XAI Integration:
○ SHAP (SHapley Additive exPlanations) is employed to provide interpretability.
○ For example, if a traffic sample is flagged as a DoS attack, SHAP can show that
high packet rates and low inter-packet arrival times were the key contributing
factors.

4. Blockchain Integration

Blockchain is integrated to ensure secure, tamper-proof communication and decentralized coordination between edge devices and the cloud.

● Traffic Logging:
○ Each detected anomaly is logged with attributes such as source IP, destination IP,
timestamp, and detected attack type.
○ Logs are distributed across Blockchain nodes, ensuring data immutability.
● Smart Contracts:
○ Pre-programmed rules automate actions such as:
■ Blocking malicious IPs.
■ Throttling traffic from flagged devices.
○ For instance, a detected DoS attack triggers a smart contract to notify all
connected devices to isolate the source node.
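The smart-contract rule described here (block a source IP once enough distinct edge devices report it) can be sketched in Python. The quorum of 3 is an illustrative parameter, and a real deployment would encode this rule on-chain rather than in application code.

```python
def smart_contract(reports, quorum=3):
    """Block any source IP reported as anomalous by >= quorum distinct devices."""
    per_ip = {}
    for r in reports:
        # Count distinct reporting devices per IP, not raw report volume,
        # so one noisy device cannot trigger a network-wide block on its own.
        per_ip.setdefault(r["src_ip"], set()).add(r["device"])
    return {ip for ip, devices in per_ip.items() if len(devices) >= quorum}

reports = [
    {"device": "edge-1", "src_ip": "10.0.0.9"},
    {"device": "edge-2", "src_ip": "10.0.0.9"},
    {"device": "edge-3", "src_ip": "10.0.0.9"},
    {"device": "edge-1", "src_ip": "10.0.0.4"},   # only one reporter: no block
]
blocked = smart_contract(reports)
```

Requiring agreement across devices mirrors the decentralized anomaly verification described below: no single compromised node can blacklist traffic by itself.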

5. Cloud Layer Implementation

The cloud layer aggregates and verifies anomalies flagged by edge devices using advanced
analytical models.

● Model Usage:
○ Graph Neural Networks (GNNs) analyze the IoT network's topology.
○ Nodes represent devices, and edges represent traffic flows.
○ The model detects patterns indicative of coordinated DoS attacks, such as
abnormal traffic from clusters of devices.
● Global Analysis:
○ Consolidates data from multiple edge nodes.
○ Correlates localized anomalies to detect distributed DoS (DDoS) attacks.
● Blockchain Validation:
○ Cloud systems query Blockchain logs to validate edge-detected anomalies.
○ If multiple edge devices report consistent anomalies, the cloud flags them as
verified attacks.

6. Challenges and Future Directions

● Challenges:
○ The computational overhead of Blockchain in resource-constrained IoT devices.
○ Balancing the latency of real-time detection with the need for detailed XAI
explanations.
○ Scaling the framework for large IoT networks with millions of devices.
● Future Work:
○ Explore lightweight Blockchain implementations such as Directed Acyclic
Graphs (DAGs) for reduced computational demands.
○ Optimize XAI techniques for faster explanations without sacrificing
interpretability.
○ Incorporate federated learning to train detection models collaboratively across
devices while preserving data privacy.
○ Deploy the framework in real-world IoT environments, such as smart cities or
healthcare systems, to validate scalability and robustness.

7. Conclusion

The proposed Edge-to-Cloud system combining XAI and Blockchain addresses critical
challenges in IoT network security against DoS attacks. By ensuring real-time detection,
transparency, and secure communication, the framework offers a robust solution for modern IoT
environments. While challenges remain in terms of computational efficiency and scalability, the
integration of these technologies marks a significant step forward in safeguarding IoT networks
from evolving cyber threats.

References
Cao, Zhiqiang, et al. "Edge-Cloud collaborated object detection via difficult-case discriminator." 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS). IEEE, 2023. https://arxiv.org/abs/2108.12858

Andriulo, Francesco Cosimo, et al. "Edge Computing and Cloud Computing for Internet of Things: A Review." Informatics. Vol. 11. No. 4. MDPI, 2024. https://doi.org/10.3390/informatics11040071

https://www.wireshark.org/about.html
https://www.kali.org/docs/introduction/what-is-kali-linux/
https://youtu.be/KytAmziXs4k?si=Lx4uJJtFEVYo-Ea4

Shah, Syed Ali Raza, and Biju Issac. "Performance comparison of intrusion detection systems and application of machine learning to Snort system." Future Generation Computer Systems 80 (2018): 157-170. https://doi.org/10.1016/j.future.2017.10.016

Manikumar, D. V. V. S., and B. Uma Maheswari. "Blockchain based DDoS mitigation using machine learning techniques." 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA). IEEE, 2020. https://doi.org/10.1109/ICIRCA48905.2020.9183092

Kumari, Pooja, et al. "Leveraging blockchain and machine learning to counter DDoS attacks over IoT network." Multimedia Tools and Applications (2024): 1-25. https://link.springer.com/article/10.1007/s11042-024-18842-4

Kumar, Prabhat, et al. "Blockchain and explainable AI for enhanced decision making in cyber threat detection." Software: Practice and Experience (2024). https://onlinelibrary.wiley.com/doi/full/10.1002/spe.3319

Kalutharage, Chathuranga Sampath, et al. "Explainable AI-based DDoS attack identification method for IoT networks." Computers 12.2 (2023): 32. https://doi.org/10.3390/computers12020032

Almadhor, Ahmad, et al. "Strengthening network DDoS attack detection in heterogeneous IoT environment with federated XAI learning approach." Scientific Reports 14.1 (2024): 24322. https://www.nature.com/articles/s41598-024-76016-6
Contribution of the members:

NAME CONTRIBUTION

Sri Nithya Bandi - Introduction and description of DoS and DDoS attacks
- Different types of attacks on edge device and cloud
system
- Differences of DoS attack detection on edge device
and cloud system
- DoS attack detection techniques analysis on edge device
and Cloud system
- Challenges of Detection on Edge vs. Cloud
- Current Industry Standards & comparison with proposed
solution

Bhavya Reddy Sama - XAI and Blockchain-Powered Edge-to-Cloud System for DoS Mitigation in IoT
- XAI for Transparent and Trustworthy DoS Detection
- Blockchain for Secure Data Management and
Collaboration
- Edge-to-Cloud Architecture for Efficient and Scalable
DoS Mitigation
- Implementation Plan for XAI and Blockchain Integration
in Edge-to-Cloud DoS Detection
- Current Industry Standards & comparison with proposed
solution

Sai Varun Ragi - What flooding DoS attacks are and their types
- Performing a DoS attack on Metasploitable2 using
Kali Linux and collecting the packets via Wireshark
- Short comparison of different types of attacks
- Snort IDS - one of the current industrial standards for
detection of DoS attacks

Pranjal Bhardwaj - Data preprocessing and feature normalization for NSL-KDD dataset
- Feature selection and analysis for effective DoS detection
- Edge-based CNN-LSTM model for real-time DoS
detection
- Performance metrics and resource usage optimization for
developed edge model
- Challenges and solutions for implementing and
improving the accuracy of the CNN-LSTM model for
real-time DoS detection
- Current Industry Standards & comparison with proposed
solution

Sreyash Somesh Mishra - Cloud-based Random Forest model for DoS attack
detection
- Cloud-based ANN model for DoS attack detection
- Justification for employing Random forest model for
cloud based DoS attack detection in IoT networks
- Comparison with some other models for cloud
deployment
- Dataset information & Data pre-processing for
NSL-KDD dataset
- Current Industry Standards & comparison with proposed
solution

Gurupriya D - The need for enhanced communication protocol between the Edge and the Cloud
- Different communication protocols for enhanced Edge to
Cloud synchronization
- Introduction to the Small-Big Model Framework
- Communication workflow of the Model
- Testing and Evaluation of the model with different
datasets
