ISOT-Dataset-Overview-v0.5

The ISOT dataset combines various publicly available malicious and non-malicious datasets, including traffic from the Storm and Waledac botnets, as well as everyday usage traffic from the Ericsson Research and Lawrence Berkeley National Lab. The dataset was created by merging these sources to simulate a real-world bot-infected subnet, with a total of 1,675,424 unique flows, of which 55,904 are malicious. This comprehensive dataset is intended for training and evaluating network behavior analysis and machine learning techniques for botnet detection.

ISOT Dataset Overview

The ISOT dataset is a combination of several existing publicly available malicious and
non-malicious datasets.
We obtained and used two separate datasets containing malicious traffic from the French
chapter of the Honeynet Project [1], involving the Storm and Waledac botnets,
respectively. Waledac is currently one of the most prevalent P2P botnets and is widely
considered the successor of the Storm botnet, with a more decentralized
communication protocol. Unlike Storm, which uses Overnet as its communication channel,
Waledac relies exclusively on HTTP communication and a fast-flux based DNS network.
To represent non-malicious, everyday usage traffic, we incorporated two different
datasets, one from the Traffic Lab at Ericsson Research in Hungary [2] and the other from
the Lawrence Berkeley National Lab (LBNL) [3]. The Ericsson Lab dataset contains a
large amount of general traffic from a variety of applications, including HTTP web
browsing, World of Warcraft gaming packets, and packets from popular
BitTorrent clients such as Azureus. We also incorporated all the datasets from the LBNL
trace data to provide additional non-malicious background traffic. The LBNL is a
research institute with a medium-sized enterprise network. The LBNL trace data consists
of five datasets labeled D0…D4; Table 1 provides general information for each of the
datasets. The network traces were recorded over a three-month period, from
October 2004 to January 2005, covering 22 subnets. The dataset contains trace data for a
variety of network activities, ranging from web and email to backup and streaming
media. This variety of traffic serves as a good example of day-to-day use of enterprise
networks.

Table 1: LBNL datasets general information

                    D0          D1          D2          D3          D4
Date                Oct 4, 04   Dec 15, 04  Dec 16, 04  Jan 6, 05   Jan 7, 05
Duration            10 min      1 hour      1 hour      1 hour      1 hour
Number of Subnets   22          22          22          18          18
Number of Hosts     2,531       2,102       2,088       1,561       1,558
Number of Packets   18M         65M         28M         22M         28M
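
To get a rough sense of how intense each LBNL trace is, the figures in Table 1 can be turned into mean packet rates. The sketch below is only illustrative: the dictionary layout and the function name are our own, and the packet counts are the rounded values from the table, so the rates are approximate.

```python
# Durations and packet counts copied from Table 1.
# Packet counts are rounded to millions in the table, so rates are approximate.
LBNL_TRACES = {
    "D0": {"duration_s": 10 * 60, "packets": 18_000_000},
    "D1": {"duration_s": 60 * 60, "packets": 65_000_000},
    "D2": {"duration_s": 60 * 60, "packets": 28_000_000},
    "D3": {"duration_s": 60 * 60, "packets": 22_000_000},
    "D4": {"duration_s": 60 * 60, "packets": 28_000_000},
}


def mean_packet_rate(trace):
    """Return the average number of packets per second for one trace."""
    return trace["packets"] / trace["duration_s"]


for name, trace in LBNL_TRACES.items():
    print(f"{name}: about {mean_packet_rate(trace):,.0f} packets/s")
```

The short D0 trace comes out noticeably denser than the one-hour traces, which may matter when sampling background traffic evenly across D0…D4.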

In order to produce an experimental dataset with both malicious and non-malicious
traffic, we merged the above datasets into a single trace file via a specific
process. First, we mapped the IP addresses of the infected machines to two of the
machines providing the background traffic. Second, we replayed all of the trace files
using the TcpReplay tool on the same network interface card in order to homogenize the
network behavior exhibited by all three datasets; the replayed data was then captured with
Wireshark for evaluation. Figure 1 depicts this merging process.
The evaluation data produced by this process was further merged with all datasets
from the LBNL trace data to provide one extra subnet and thereby simulate a real
enterprise-size network with thousands of hosts. The resulting evaluation dataset contains
22 subnets from the LBNL with non-malicious traffic and one subnet (172.16.0.0/16), as
illustrated in Figure 1, with both malicious and non-malicious traffic; within that subnet,
the malicious and non-malicious traffic appears to originate from the same machines.

Figure 1: Dataset merging process
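
The first step of the merging process, mapping the addresses of infected machines onto background hosts, can be sketched as a simple address substitution. This is a minimal illustration under stated assumptions, not the procedure actually used: the `Packet` record, the `remap_packet` function, and the source addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical, and the pairing with 172.16.2.11 and 172.16.2.12 is chosen only for the example. The replay and capture steps are not modeled.

```python
import ipaddress
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record used only for this sketch."""
    src_ip: str
    dst_ip: str
    length: int


# Assumed mapping: original bot address -> background host address.
# The left-hand addresses are placeholders, not values from the dataset.
IP_MAP = {
    "192.0.2.10": "172.16.2.11",
    "192.0.2.20": "172.16.2.12",
}

# Fail early if any address in the mapping is malformed.
for old, new in IP_MAP.items():
    ipaddress.ip_address(old)
    ipaddress.ip_address(new)


def remap_packet(packet, ip_map):
    """Return a copy of the packet with mapped source/destination addresses."""
    return packet._replace(
        src_ip=ip_map.get(packet.src_ip, packet.src_ip),
        dst_ip=ip_map.get(packet.dst_ip, packet.dst_ip),
    )


bot_packet = Packet("192.0.2.10", "198.51.100.7", 60)
print(remap_packet(bot_packet, IP_MAP))
```

After such a substitution, bot traffic and background traffic share the same host addresses, which is why the labels described below rely on MAC addresses rather than IP addresses.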

Table 2: List of machines that generate malicious/non-malicious traffic and corresponding labels.

IP Address     Type of Traffic Generated          Label of Malicious Traffic

172.16.2.11    Malicious / UDP (Storm)            Src/Dst MAC BB:BB:BB:BB:BB:BB
172.16.0.2     Malicious / SMTP Spam (Waledac)    Src/Dst MAC AA:AA:AA:AA:AA:AA
172.16.0.11    Malicious / SMTP Spam (Waledac)    Src/Dst MAC AA:AA:AA:AA:AA:AA
172.16.0.12    Malicious / SMTP Spam (Storm)      Src/Dst MAC AA:AA:AA:AA:AA:AA
172.16.2.2     Non-Malicious                      Normal Src/Dst MAC
172.16.2.3     Non-Malicious                      Normal Src/Dst MAC
172.16.2.11    Non-Malicious                      Normal Src/Dst MAC
172.16.2.12    Non-Malicious                      Normal Src/Dst MAC
172.16.2.12    Malicious / Zeus                   Src/Dst MAC CC:CC:CC:CC:CC:CC
172.16.2.12    Malicious / Zeus (C & C)           Src/Dst MAC CC:CC:CC:DD:DD:DD
172.16.2.13    Non-Malicious                      Normal Src/Dst MAC
172.16.2.14    Non-Malicious                      Normal Src/Dst MAC
172.16.2.111   Non-Malicious                      Normal Src/Dst MAC
172.16.2.112   Non-Malicious                      Normal Src/Dst MAC
172.16.2.113   Non-Malicious                      Normal Src/Dst MAC
172.16.2.114   Non-Malicious                      Normal Src/Dst MAC
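
Because hosts such as 172.16.2.11 and 172.16.2.12 appear with both malicious and non-malicious traffic, the MAC address is what separates the two. A labeling pass over captured frames might look like the sketch below. The function name and label strings are our own, and we assume a frame counts as malicious when either its source or its destination MAC matches one of the marker values in the table; the table itself does not spell out that rule.

```python
# Marker MAC addresses from the label column of the table above.
# AA:AA:AA:AA:AA:AA is shared by the Waledac and Storm SMTP spam rows.
MALICIOUS_MAC_LABELS = {
    "aa:aa:aa:aa:aa:aa": "SMTP spam (Waledac/Storm)",
    "bb:bb:bb:bb:bb:bb": "Storm (UDP)",
    "cc:cc:cc:cc:cc:cc": "Zeus",
    "cc:cc:cc:dd:dd:dd": "Zeus (C&C)",
}


def label_frame(src_mac, dst_mac):
    """Label a frame by its MAC addresses.

    Assumption: a match on either the source or the destination MAC
    is enough to mark the frame as malicious.
    """
    for mac in (src_mac, dst_mac):
        label = MALICIOUS_MAC_LABELS.get(mac.strip().lower())
        if label is not None:
            return label
    return "non-malicious"


print(label_frame("BB:BB:BB:BB:BB:BB", "00:11:22:33:44:55"))
print(label_frame("00:11:22:33:44:55", "00:11:22:33:44:66"))
```

The MAC addresses in the two calls that are not marker values are arbitrary placeholders.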

It is assumed that all the traffic from the LBNL is non-malicious. Table 2 lists the IPs of
the machines in the subnet 172.16.0.0/16 that generate malicious and non-malicious
traffic, and Table 3 provides some statistics about the unique flows in the dataset. In
addition to our labeling, the traffic from the Traffic Lab at Ericsson Research in Hungary is
labeled to the level of the flow type, such as HTTP, SMTP, and FTP, and contains no
malicious traffic. Using the combination of these traffic sets, we simulate the
behavior of a real-world bot-infected subnet to the best of our ability while at the same
time taking advantage of existing, well-labeled data, which we use for training and
evaluation purposes.

Table 3: Total number of unique malicious and non-malicious flows.

                 Unique Flows
Malicious        55,904 (3.33%)
Non-malicious    1,619,520 (96.66%)
Total            1,675,424 (100%)
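
The flow counts above imply a strong class imbalance, which is worth quantifying before training a classifier on this data. The sketch below recomputes the shares from the raw counts and derives inverse-frequency class weights. The weighting formula is a common heuristic that we bring in for illustration; it is not something the dataset authors prescribe, and the variable names are our own.

```python
# Flow counts copied from the table of unique flows.
MALICIOUS_FLOWS = 55_904
NON_MALICIOUS_FLOWS = 1_619_520
TOTAL_FLOWS = MALICIOUS_FLOWS + NON_MALICIOUS_FLOWS

# Share of each class as a fraction of all unique flows.
malicious_share = MALICIOUS_FLOWS / TOTAL_FLOWS
non_malicious_share = NON_MALICIOUS_FLOWS / TOTAL_FLOWS

# Number of non-malicious flows per malicious flow.
imbalance_ratio = NON_MALICIOUS_FLOWS / MALICIOUS_FLOWS

# Inverse-frequency weights: total / (number of classes * class count).
# This is an assumed heuristic, not part of the dataset description.
class_weights = {
    "malicious": TOTAL_FLOWS / (2 * MALICIOUS_FLOWS),
    "non_malicious": TOTAL_FLOWS / (2 * NON_MALICIOUS_FLOWS),
}

print(f"total flows:         {TOTAL_FLOWS:,}")
print(f"malicious share:     {malicious_share:.4%}")
print(f"non-malicious share: {non_malicious_share:.4%}")
print(f"imbalance ratio:     {imbalance_ratio:.2f} : 1")
print(f"class weights:       {class_weights}")
```

The recomputed shares sum to exactly 100%, whereas the two-decimal percentages printed in the table sum to 99.99%, which suggests the table values were truncated rather than rounded.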

References
[1] French Chapter of the Honeynet Project. http://www.honeynet.org/chapters/france
[2] G. Szabó, D. Orincsay, S. Malomsoky, and I. Szabó, "On the validation of traffic classification
algorithms," in Proceedings of the 9th International Conference on Passive and Active Network
Measurement, PAM'08, (Berlin, Heidelberg), pp. 72–81, Springer-Verlag, 2008.
[3] LBNL Enterprise Trace Repository. [Online] 2005. http://www.icir.org/enterprise-tracing

To reference this dataset, use:

Sherif Saad, Issa Traore, Ali A. Ghorbani, Bassam Sayed, David Zhao, Wei Lu, John
Felix, Payman Hakimian, "Detecting P2P botnets through network behavior analysis and
machine learning", Proceedings of the 9th Annual Conference on Privacy, Security and Trust
(PST2011), July 19-21, 2011, Montreal, Quebec, Canada.
