Semi-supervised Machine Learning Approach for DDoS Detection (2)

This document presents a semi-supervised machine learning approach for detecting DDoS attacks and malware on Android platforms by analyzing network traffic as text documents using natural language processing (NLP) techniques. The proposed methodology includes a feature selection algorithm and co-clustering to improve detection accuracy while reducing false positives. The system demonstrates a high detection rate of 99.15% for harmful applications, outperforming existing antivirus solutions.

Uploaded by

pikkiribabu

SEMI-SUPERVISED MACHINE LEARNING

APPROACH FOR DDOS DETECTION

ABSTRACT:
The appearance of malicious apps is a serious threat to the Android
platform. Most malicious apps rely on the device's integrated network
interfaces to steal users' personal information and launch attacks. In this
paper, we propose an effective and automatic malware detection method using the
text semantics of network traffic. In particular, we consider each HTTP flow
generated by mobile apps as a text document, which can be processed by natural
language processing to extract text-level features. These features are then used
to create a useful malware detection model. We examine the traffic flow header
using the N-gram method from natural language processing (NLP). Then, we propose
an automatic feature selection algorithm based on the chi-square test to identify
meaningful features; the test determines whether there is a significant
association between two variables. In summary, we propose a novel solution that
performs malware detection with NLP methods by treating mobile traffic as
documents, and we apply an automatic feature selection algorithm over N-gram
sequences to obtain meaningful features from the semantics of traffic flows. Our
method reveals some malware that evades detection by antivirus scanners. In
addition, we design a detection system that can monitor the traffic of an
institutional or enterprise network, a home network, and a 3G/4G mobile
network. The system can be integrated with a computer connected to the network
to find suspicious network behavior.
Index Terms—Malware detection, HTTP flow analysis, text semantics,
machine learning.
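
The N-gram extraction and chi-square feature selection described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tokenization rule, the function names (word_ngrams, chi_square, select_features) and the four sample flows are all assumptions made for the example, and the chi-square statistic is the standard one for a 2x2 contingency table.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the set of word-level n-grams in one HTTP flow header.

    The separators replaced here are an assumed tokenization rule.
    """
    for sep in "/?&=":
        text = text.replace(sep, " ")
    tokens = text.split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 table.

    a: malicious flows containing the n-gram, b: benign flows containing it,
    c: malicious flows without it,            d: benign flows without it.
    """
    total = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else total * (a * d - b * c) ** 2 / denom


def select_features(flows, labels, n=1, top_k=3):
    """Rank n-grams by their chi-square score against the flow label."""
    n_mal = sum(labels)
    n_ben = len(labels) - n_mal
    mal_counts, ben_counts = Counter(), Counter()
    for text, label in zip(flows, labels):
        (mal_counts if label else ben_counts).update(word_ngrams(text, n))
    scores = {}
    for gram in set(mal_counts) | set(ben_counts):
        a, b = mal_counts[gram], ben_counts[gram]
        scores[gram] = chi_square(a, b, n_mal - a, n_ben - b)
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda g: (-scores[g], g))[:top_k]


# Hypothetical flows: label 1 = malicious, 0 = benign.
flows = [
    "GET /ads/track?imei=123 HTTP/1.1",
    "POST /upload?imei=456 HTTP/1.1",
    "GET /index.html HTTP/1.1",
    "GET /news/today HTTP/1.1",
]
labels = [1, 1, 0, 0]
top_features = select_features(flows, labels, n=1, top_k=1)
```

In this toy data the token "imei" appears in both malicious flows and in no benign flow, so it receives the highest score, while a token such as "HTTP" that occurs in every flow scores zero and carries no information about the label.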

ARCHITECTURE:

EXISTING SYSTEM:
The first phase of the existing approach divides the incoming network traffic
into three types of protocol (TCP, UDP or Other) and then classifies it as normal
or anomalous traffic. In the second stage, a multi-class algorithm classifies the
anomalies detected in the first phase to identify the attack class, in order to
choose the appropriate intervention. Two public datasets are used for the
experiments in this paper, namely UNSW-NB15 and NSL-KDD.
Several approaches have been proposed for detecting DDoS attacks, among them
approaches based on information theory and machine learning. The performance of
network intrusion detection approaches generally depends on the distribution
characteristics of the underlying network traffic data used for assessment. The
DDoS detection approaches in the literature fall under two main categories:
unsupervised approaches and supervised approaches. Depending on the benchmark
datasets used, unsupervised approaches often suffer from a high false positive
rate, while supervised approaches cannot handle large amounts of network traffic
data and their performance is often limited by noisy and irrelevant network data.
Therefore, the need arises to combine supervised and unsupervised approaches to
overcome these DDoS detection issues.

DISADVANTAGES:
• The datasets above are split into train subsets and test subsets using a
configuration of 60% and 40% respectively. The train subsets are used to fit
the Extra-Trees ensemble classifiers and the test subsets are used to test the
entire proposed approach. Before fitting the classifiers, the train subsets are
normalized using the MinMax method.
• This section presents the details of the proposed approach and the
methodology followed for detecting the DDoS attack. The proposed
approach consists of five major steps: dataset preprocessing, estimation of
network traffic entropy, online co-clustering, information gain ratio
computation, and classification.
• The aim of splitting the anomalous network traffic is to reduce the amount of
data to be classified by excluding the normal cluster from the classification.
For DDoS detection, normal traffic records are irrelevant and noisy, as
normal behavior continues to evolve. Most of the time, new unseen
normal traffic instances increase the false positive rate and decrease the
classification accuracy. Hence, excluding some noisy normal instances of the
network traffic data from classification is beneficial in terms of low false
positive rates and classification accuracy. This assumes that, after the
network traffic clustering, one cluster contains only normal traffic, a
second one contains only DDoS traffic and a third one contains both DDoS
and normal traffic.
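
The 60%/40% split and the MinMax normalization mentioned in the first point can be sketched as below. This is only an illustration under stated assumptions: the helper names, the random seed and the synthetic records are hypothetical, and reusing the training minimum and maximum to scale the test subset is a common convention that the text does not confirm.

```python
import random


def train_test_split(rows, train_fraction=0.6, seed=0):
    """Shuffle a copy of the rows and cut it into train and test subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_min_max(rows):
    """Return the per-column minimum and maximum of the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale every column with (x - min) / (max - min); constant columns map to 0."""
    return [
        [0.0 if hi == lo else (value - lo) / (hi - lo)
         for value, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Synthetic two-feature records, used only to exercise the helpers.
records = [[float(i), float(i * 10 % 7)] for i in range(10)]
train, test = train_test_split(records)

# Assumption: the scaling parameters come from the train subset only.
mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
test_scaled = apply_min_max(test, mins, maxs)
```

After scaling, every training feature lies in [0, 1]. Test values can fall slightly outside that range when a test record exceeds the extremes seen during training.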

PROPOSED SYSTEM:

This section introduces our methodology for detecting the DDoS attack. The
methodology follows the five-step process for applying data mining techniques
in network systems. The main aim of combining the algorithms used in the proposed
approach is to reduce noisy and irrelevant network traffic data before the
preprocessing and classification stages for DDoS detection, while maintaining
high performance in terms of accuracy, false positive rate and running time, and
low resource usage. Our approach starts by estimating the entropy of the FSD
features over a time-based sliding window. When the average entropy of a time
window falls below its lower threshold or exceeds its upper threshold, the
co-clustering algorithm splits the received network traffic into three clusters.
Entropy estimation over sliding time windows makes it possible to detect abrupt
changes in the incoming network traffic distribution, which are often caused by
DDoS attacks. Incoming network traffic within the time windows that have abnormal
entropy values is suspected to contain DDoS traffic. Focusing only on the
suspected time windows filters out a significant amount of network traffic data,
so only relevant data is selected for the remaining steps of the proposed
approach. In addition, significant resources are saved when no abnormal entropy
occurs. In order to determine the normal cluster, we estimate the information
gain ratio based on the average entropy of the FSD features between the network
traffic data received during the current time window and each of the obtained
clusters. As discussed in the previous section, during a DDoS period the amount
of attack traffic generated is much larger than the normal traffic. Hence,
estimating the information gain ratio based on the FSD features makes it possible
to identify the two clusters that preserve more information about the DDoS attack
and the cluster that contains only normal traffic. Therefore, the cluster that
produces the lowest information gain ratio is considered normal and the remaining
clusters are considered anomalous. The information gain ratio is computed for each
cluster as follows:
where subset_w represents the subset of network data received during the
time window w, C_i (i = 1, 2, 3) are the clusters obtained from subset_w and
|C_i| is the size of the i-th cluster. avgH(subset) is the average entropy of
the FSD features of the input subset and |subset| represents the size of the
subset.

ADVANTAGES:
• The clustering of the incoming network traffic data removes a significant
amount of normal and noisy data before the preprocessing and
classification steps. More than 6% of a whole traffic dataset can be filtered.
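
The entropy check on a time window and the choice of the normal cluster can be sketched as follows. Several things here are assumptions made for illustration: the records are tuples of already extracted feature values, the thresholds are arbitrary, the clusters are given rather than produced by co-clustering, and the ratio avgH(subset_w) - |C_i| / |subset_w| * avgH(C_i) is only one form consistent with the symbols defined above, since the formula itself is not reproduced in this text.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_entropy(records):
    """Mean entropy over the feature columns of a list of records."""
    if not records:
        return 0.0
    columns = list(zip(*records))
    return sum(shannon_entropy(col) for col in columns) / len(columns)


def is_suspicious(window, lower, upper):
    """A window is suspicious when its average entropy leaves [lower, upper]."""
    h = average_entropy(window)
    return h < lower or h > upper


def gain_ratio(window, cluster):
    # ASSUMED form of the ratio; the source does not show the formula.
    weight = len(cluster) / len(window)
    return average_entropy(window) - weight * average_entropy(cluster)


def normal_cluster_index(window, clusters):
    """Index of the cluster with the lowest ratio, treated as normal."""
    ratios = [gain_ratio(window, c) for c in clusters]
    return min(range(len(clusters)), key=ratios.__getitem__)


# Synthetic single-feature records, standing in for one time window
# that has already been split into three clusters.
c1 = [("a",), ("b",), ("c",), ("d",)]
c2 = [("e",), ("e",)]
c3 = [("a",), ("e",)]
window = c1 + c2 + c3
clusters = [c1, c2, c3]

suspicious = is_suspicious(window, lower=0.5, upper=2.0)
normal_index = normal_cluster_index(window, clusters)
```

Only the windows flagged by is_suspicious would be passed on to clustering, which is where the resource saving described above comes from.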

MODULES:
The project is divided into four modules, listed below:
• User Apps
• DDoS Attack Detection
• Classification of DDoS Attacks
• Graphical Analysis
The project is implemented from these four modules, and a bag of discriminative
words is obtained.

1. User Apps
Users operate various types of devices: smartphones, desktops, laptops and
tablets. Any of these devices can be attacked by unauthorized malware. Such
malware threatens the user's personal data, including personal contacts, bank
account numbers and any kind of personal documents, all of which can be stolen.

2. DDoS Attack Detection
The user follows any link. Notably, not all network traffic data generated by
malicious apps corresponds to malicious traffic. Many malware samples take the
form of repackaged benign apps; thus, malware can also contain the basic
functions of a benign app. Consequently, the network traffic they generate is a
mix of benign and malicious network traffic. We examine the traffic flow headers
using the co-clustering algorithm together with natural language processing
(NLP) techniques.

3. Classification of DDoS Attacks
Here, we compare the classification performance of the co-clustering algorithm
with that of other popular machine learning classification algorithms. For all
algorithms, we try multiple sets of parameters to maximize the performance of
each algorithm. The co-clustering algorithm is used to classify malware based on
bag-of-words weights.

4. Graphical Analysis
The graphical analysis takes the values obtained in the result analysis and
presents them graphically. This project uses pie charts, pyramid charts and
funnel charts.

ALGORITHM
The co-clustering algorithm performs a simultaneous clustering of the rows and
columns of a data matrix based on a specific criterion. It produces clusters of
rows and columns which represent sub-matrices of the original data matrix with
some desired properties. Clustering the rows and columns of a data matrix
simultaneously yields three major benefits:
• Dimensionality reduction, as each cluster is created based on a subset of the
original features.
• A more compressed data representation that preserves the information in the
original data.
• A significant reduction of the clustering computational complexity.
The computational complexity of co-clustering is O(mkl + nkl), which is much
smaller than the O(mnk) of the traditional K-means algorithm, where m is the
number of rows, n is the number of columns, k is the number of row clusters and
l is the number of column clusters.
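
To make the complexity comparison concrete, the short calculation below plugs sample sizes into the two expressions. The values of m, n, k and l are hypothetical, and the functions only count the dominant terms of O(mkl + nkl) and O(mnk); they are not measurements of any real implementation.

```python
def coclustering_cost(m, n, k, l):
    """Dominant term of co-clustering: m*k*l + n*k*l."""
    return m * k * l + n * k * l


def kmeans_cost(m, n, k):
    """Dominant term of traditional K-means: m*n*k."""
    return m * n * k


# Hypothetical sizes: 100,000 traffic records, 40 features,
# 3 row clusters and 5 column clusters.
m, n, k, l = 100_000, 40, 3, 5

co_cost = coclustering_cost(m, n, k, l)   # 1,500,600
km_cost = kmeans_cost(m, n, k)            # 12,000,000
ratio = km_cost / co_cost                 # roughly 8
```

The gap comes from comparing (m + n) * l with m * n, so the saving holds when the number of column clusters l is small relative to the number of features n.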

REQUIREMENT ANALYSIS

The project involved analyzing the design of a few applications so as to make
the application more user-friendly. To do so, it was important to keep the
navigation from one screen to the next well ordered while reducing the amount of
typing the user needs to do. In order to make the application more accessible,
the browser version had to be chosen so that it is compatible with most
browsers.

REQUIREMENT SPECIFICATION

Functional Requirements

• Graphical user interface for the user.

Software Requirements

For developing the application the following are the Software Requirements:

1. Python

2. Django

3. Sqlite

Operating Systems supported

1. Windows 7

2. Windows XP

3. Windows 8

Technologies and Languages used to Develop

1. Python

Debugger and Emulator

• Any browser (particularly Chrome)

Hardware Requirements

For developing the application the following are the Hardware Requirements:

• Processor: Pentium IV or higher
• RAM: 256 MB
• Space on hard disk: minimum 512 MB

CONCLUSION:
Malware is a new and fast-growing threat to Android. Currently, many research
methods and antivirus scanners cannot cope with the growing size and diversity
of mobile malware. As a solution, we introduce a method for mobile malware
detection using network traffic flows, which treats each HTTP flow as a document
and analyzes HTTP flow requests using NLP string analysis. N-gram sequence
generation, the feature selection algorithm, and the SVM algorithm are used to
create a useful malware detection model. Our evaluation demonstrates the
efficiency of this solution: our trained model greatly improves on existing
approaches and identifies malicious flows with few false alarms. The detection
rate for harmful traffic is 99.15%, while the false alarm rate for harmful
traffic is 0.45%. Tests on newly discovered malware further verify the
performance of the proposed system. When used in real environments, the model
can detect 54.81% of harmful applications, which is better than other popular
antivirus scanners. The tests show that our model can detect malware samples
that evade detection by other virus scanners. It is also possible to obtain
VirusTotal detection reports for newly discovered malicious samples. In
addition, once new samples are added to the training set, the model can be
retrained and updated to cover the new malware.
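
For reference, rates like the two quoted above are usually derived from confusion-matrix counts as sketched below. The counts are hypothetical and were chosen only to show the arithmetic; they are not the evaluation data behind the 99.15% and 0.45% figures. Reading the second figure as a false positive rate is also an assumption.

```python
def detection_rate(true_positives, false_negatives):
    """Share of harmful flows that the model flags as harmful."""
    return true_positives / (true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    """Share of benign flows that the model wrongly flags as harmful."""
    return false_positives / (false_positives + true_negatives)


# Hypothetical counts for 1,000 harmful and 1,000 benign flows.
tp, fn = 990, 10
fp, tn = 5, 995

dr = detection_rate(tp, fn)         # 990 / 1000
fpr = false_positive_rate(fp, tn)   # 5 / 1000
```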
