Industrial Control System-Anomaly Detection Dataset ICS-ADD for Cyber-Physical Security Monitoring in Smart Industry Environments
INDEX TERMS Industrial control system, smart industry, cybersecurity, open source, OSSIM, Suricata,
cyber-events dataset.
© 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024
G. B. Gaggero et al.: ICS-ADD for Cyber-Physical Security Monitoring
treatment and manufacturing where, given their crucial role, physical malfunctions or cyber attacks can result in severe consequences, ranging from alterations in network traffic patterns to catastrophic incidents causing service loss, injuries, environmental pollution, and damage to equipment. In recent years, the concern for cybersecurity in ICSs has escalated due to the extensive use of wireless networks and to the exposure of industrial networks over the Internet. Despite advantages such as remote maintenance and streamlined machine adjustments, there has been a significant rise in attacks on ICS networks. Consequently, there is a need for testbeds to evaluate the impact of cyber-physical attacks on industrial processes and to assess security countermeasures. Commonly employed solutions to secure CPSs include Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS), both for network (NIDS) and host monitoring (HIDS). Recent scientific literature has increasingly delved into areas such as Artificial Intelligence (AI) and Machine Learning (ML) for IDSs and IPSs, which have shown particular efficacy for the identification of unforeseen attacks [1], [2], [3]. A critical aspect is the performance evaluation of security monitoring systems to gauge their ability to detect attacks: such evaluations require realistic and sufficiently complex cyber-events datasets.
The aim of this paper is to present the ICS-Anomaly Detection Dataset (ICS-ADD), a dataset to test ICS cyber-physical security monitoring. ICS-ADD contains all the phases of an attack on an ICS, from reconnaissance to exploitation of devices, and it is composed of traffic captures and logs generated by both control devices and security monitoring systems. Due to its completeness it can be used in multiple ways: to show how open-source security monitoring systems react to ICS attacks, to expose their flaws, and to build higher-level correlation rules. The paper is structured as follows. Section II analyzes the related works on event datasets for industrial control systems cybersecurity. Section III presents the testbed used in this paper to generate ICS-ADD. Section IV explains the details of the attacks carried out on the testbed. Section V presents the structure of ICS-ADD. Section VI analyzes the performance of the open-source monitoring tools on ICS-ADD. Section VII discusses possible usages of ICS-ADD. Finally, in Section VIII, conclusions are drawn.

II. RELATED WORKS
Testbeds and datasets have always had a fundamental role in cybersecurity research, especially with the spread of Machine Learning in the development of monitoring algorithms. Reference [4], after providing an overview of ICS architectures, communication protocols and cybersecurity issues, presents and discusses a list of cybersecurity testbeds and datasets. Reference [5] proposes a methodology to generate reliable anomaly detection datasets for ICS and presents the dataset Electra, which is related to electric traction substations used in the railway industry. Reference [6] proposes a collection of datasets for ICS research. The testbed in [6] includes MATLAB & Simulink to simulate the physical system, OpenPLC as Distributed Control System (DCS) and ScadaBR as Human Machine Interface (HMI). As presented in the next sections, the testbed used in this paper also exploits OpenPLC and ScadaBR and, in addition, some cybersecurity monitoring tools.
The main characteristics of the OpenPLC platform, including its compliance with the IEC-61131-3 Standard, are described in [7]. The use of OpenPLC for cybersecurity research is supported by scientific literature. Reference [8] reproduces a Cyber-Physical System by using Simulink for the process simulation, OpenPLC as PLC, and ScadaBR as HMI. Reference [9] proposes a real-time anomaly detection framework applied on a testbed that links OpenPLC with the Graphical Realism Framework For Industrial Control Simulations (GRFICS), an open-source ICS simulation tool. Reference [10] simulates various cyber-attacks, such as Remote Scanning, False Data Injection (FDI), and Man in The Middle (MiTM), in a physical canal testbed. The control action in [10] was conducted by using OpenPLC, and the HMI was based on ScadaBR. Using open-source platforms, which simulate an industrial environment including PLC functionality, allows [10] to test and evaluate innovative detection methods. Reference [11] proposes an automatic whitelist generation method for secured PLC. Reference [12] is aimed at making PLC communication more resilient by adding encryption.
Despite this extensive research, few datasets exploit the traffic captures and the logs of the devices involved at the same time. In particular, it would be interesting to understand the role of open-source firewalls and intrusion detection systems in a complete cyberattack chain. This paper addresses this issue by using a generic ICS architecture that could be generalized to cover a variety of industrial applications. To the best of our knowledge, no published paper provides a dataset composed of both .pcap files and logs generated by open-source monitoring systems.

III. SMART INDUSTRY TESTBED
The testbed used to feed ICS-ADD emulates a simple but typical industrial control system composed of four main elements: SCADA, PLC, Firewall and Network Switch. These devices are the core of the simulated process control. In addition to control components, this paper adds two main elements: a Network Intrusion Detection System (NIDS), which analyzes all the traffic of the testbed by using a properly configured span port on the network switch, and a SIEM (Security Information and Event Management), which collects the logs generated by the Firewall, SCADA and NIDS. A photo and the overall network architecture of the testbed are shown in Figure 1.
The previously mentioned elements have been implemented as follows:
• Firewall: the firewall runs on dedicated hardware that has two network interfaces: one is connected to the external network, the other to the switch of the LAN. The open-source software used is pfSense [13].
• SCADA: the SCADA runs on a virtual machine on a rack server that also hosts the VM of the PLC. The software used is the open-source ScadaBR [14].
• PLC: the PLC runs on a virtual machine on the rack server that also hosts the VM of the SCADA. The software used is the open-source OpenPLC Runtime [15].
• SIEM: the SIEM runs on dedicated hardware that has three network interfaces: one is connected to the switch's span port to receive all LAN traffic; the other two are connected to regular physical ports of the switch and are used to receive syslog logs and to access the management interface, respectively. The software used is the open-source OSSIM [16], which consists of at least two instances: the server, where the SIEM engine resides, and the sensor, which collects the configured data. The server and sensor reside on the same hardware.
• NIDS: the software used for NIDS is Suricata, an open-source tool already installed in the OSSIM suite, which runs as a service on the same machine as the SIEM.
The switch manages a LAN whose block of IP addresses is 192.168.1.0/24. It is configured with a span port that replicates on that port all the traffic that flows into the switch, in order to feed the NIDS. The IP addresses of the involved devices are shown in Figure 1.
The simulated action is a generic industrial process related to a water treatment plant composed of two tanks that can be replenished and emptied through the activation of two pumps. From the SCADA interface, it is possible to manually activate the pumps and to set the reference level of the higher tank, thereby automating the process of pump activation. A scheme of the Human Machine Interface (HMI) of the SCADA system is shown in Figure 2.
The SCADA communicates with the PLC through the Modbus/TCP protocol. The SCADA continuously sends the level setpoint to the PLC; therefore, in the traffic it is possible to identify a regular exchange of Modbus/TCP packets between SCADA and PLC.

IV. ATTACKS AGAINST THE TESTBED
This section describes the attacks carried out against the testbed in order to feed ICS-ADD and to make cyber-security tests more challenging. The system being attacked is in a static state, as shown in Figure 2. In particular, the level of the Bottom reservoir is 91 and the level of the Upper reservoir is 9. Both the Pump (activated by clicking on START FILLING on the HMI) and the Generator (activated by clicking on START EMPTYING on the HMI) are off. The ScadaBR (acting as Modbus Master) requests the status of all configured pointers of the PLC (acting as Modbus Slave) every 500 ms. The attack sequence has been designed by following the Cyber Kill Chain framework [17] and it is summarized in Table 1.
The attack starts (step 0) with the establishment of a covert channel between the attacker and the network protected by the firewall; this action can happen after the violation of a single device connected to the network, for example through phishing emails or malicious USB devices connected to a PC. Then, the attacker proceeds with the reconnaissance phase by acquiring information about the network structure and the available services through a port scanning action (step 1). The final aim of the attacker is to permanently substitute the rightful SCADA and maliciously control the process. To achieve this goal the attacker tries different strategies in temporal sequence: violating the SCADA through a password bruteforce attack (step 2); scanning the Modbus packet exchange between PLC and SCADA (step 3), which opens the door to the exploitation of Modbus protocol vulnerabilities; creating a Man-in-the-Middle attack at the datalink layer (step 4); sending fake commands to the PLC by exploiting the vulnerabilities of the Modbus/TCP protocol (step 5); denying the communication between SCADA and PLC through a DoS attack on the server (step 6). Figure 3 shows the network localization of all attacks. The following subsections describe each attack in detail.

A. C2C COVERT CHANNEL (STEP 0)
Covert channels can be defined as any communication that violates security policy. In this specific case the aim is to bypass the firewall rules. The covert channel is established through a technique called DNS (Domain Name System) tunneling. DNS-tunneling covert channel attacks represent a sophisticated cybersecurity threat that leverages the DNS protocol to bypass traditional network security measures. The DNS is an essential component of the Internet's infrastructure, translating human-readable domain names into the IP addresses required to locate and identify
With respect to [25], ICS-ADD makes use of a larger network that comprises the field controller, SCADA, firewall, and cybersecurity monitoring tools, thus providing a broader perspective on industrial control system networks.

VI. PERFORMANCE ANALYSIS OF THE OPEN-SOURCE MONITORING TOOLS
This section presents an evaluation of the performance of the open-source security monitoring tools (pfSense, OSSIM and Suricata) on the smart industry testbed dataset ICS-ADD introduced in this paper. The evaluation aims to assess the effectiveness, accuracy, and responsiveness of these tools in identifying and reacting to a variety of simulated cyber-attacks targeting ICS environments. Table 4 lists in detail which attacks were and were not detected by pfSense and Suricata.
The performance evaluation of OSSIM and Suricata by using ICS-ADD highlights the strengths and limitations of each tool in the context of ICS security monitoring. PfSense only analyzed the external communication of the system. As discussed, all the traffic between the compromised machine and the attacker has been hidden through DNS tunnels. Since pfSense does not implement deep packet inspection over the DNS traffic, it has been completely unable to identify the C2C communication. Therefore, simple tunneling may be enough to make the firewall fail to detect ongoing attacks.
Suricata has been activated as an Intrusion Detection service on OSSIM. The purpose is to test Suricata's detection capabilities without any modification of the default rule set. As presented in Section III, the network configuration enables Suricata to analyze all network traffic flowing through the LAN (192.168.1.0/24) to which OpenPLC, ScadaBR and the compromised PC are connected. Suricata gives a certain degree of flexibility by offering the possibility of defining custom rules. In our case, the rules already implemented in the software are maintained. In the proposed scenario, Suricata only generates alarms for two attacks: Port Scanning and DoS. To improve detection, new rules need to be developed and tested to detect the remaining attacks. ICS-ADD is important for this testing phase.
No correlation rules were enabled on OSSIM. The implementation of a SIEM instead of a standalone NIDS tool is oriented towards the proposal of a complete smart industry testbed. In the presented scenario, the SIEM is just a collector of logs and alarms without an active correlation engine, but it is a good starting point to develop correlation rules. In addition to NIDS alarms and firewall logs, it is possible to integrate many other logs on SIEM platforms. Collecting logs from OT data sources on a traditional IT SIEM is not common, but it has many advantages, as discussed in depth in [27].

VII. USAGE OF THE DATASET
ICS-ADD is a rich resource for a variety of applications for the communities working on cybersecurity in industrial control systems. Possible usages of the dataset include:

1) CYBERSECURITY RESEARCH
Researchers can use the dataset to study the behavior of industrial control systems under various cyber-attack scenarios, to analyze attack patterns, to understand the impact of different types of cyber threats on ICS, and to explore the effectiveness of existing security protocols.

2) DEVELOPMENT AND TESTING OF SECURITY SOLUTIONS
Security solution developers can leverage the dataset to test and refine Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), and other cybersecurity tools. ICS-ADD provides realistic attack scenarios that can help to improve the accuracy and efficiency of threat detection algorithms.
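One way to use a traffic capture for testing an IDS, sketched here under stated assumptions, is to replay it through Suricata in offline mode (for example `suricata -r <capture.pcap> -l <output-dir>`) and tally the resulting alerts per signature, so that attack steps that raise no alert stand out. The Python sketch below assumes that EVE JSON logging is enabled, so that each output line is one JSON event carrying an `event_type` field and, for alerts, an `alert.signature` field. The function name is hypothetical, and the sketch is not part of the tooling distributed with ICS-ADD.

```python
import json
from collections import Counter
from typing import Iterable


def count_alert_signatures(eve_lines: Iterable[str]) -> Counter:
    """Count Suricata alerts per signature from EVE JSON lines.

    Assumes one JSON object per line; non-alert events, blank lines,
    and malformed records are skipped.
    """
    counts: Counter = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if not isinstance(event, dict) or event.get("event_type") != "alert":
            continue
        alert = event.get("alert") or {}
        counts[alert.get("signature", "<unknown>")] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle on the EVE log can be passed directly. Comparing the resulting counts before and after a rule change gives a simple regression check for a custom rule set.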
3) TRAINING AND EDUCATION
ICS-ADD can be useful as an educational tool for students and professionals in cybersecurity training programs. It can be used in practical exercises to teach the fundamentals of ICS security, threat analysis, and incident response strategies.

4) BENCHMARKING AND PERFORMANCE EVALUATION
Organizations can use ICS-ADD to benchmark the performance of their current security systems against known threats. This can help in identifying gaps in their security posture and in making informed decisions on necessary upgrades or changes.

5) MACHINE LEARNING MODEL DEVELOPMENT
Data scientists and machine learning engineers can utilize ICS-ADD to develop and train machine learning models for anomaly detection, threat prediction, and automated response mechanisms. The diverse range of attacks and responses contained in the dataset provides a comprehensive basis to train robust models.
ICS-ADD is a versatile tool that can support a wide range of activities aimed at improving the security and resilience of industrial control systems against cyber threats. Table 5 summarizes some suggestions, which are explained in more detail below.
The proposed dataset allows developing new effective detection rules for IDS or custom correlation rules for SIEM. Concerning IDS, the main source is represented by the .pcap file, while, concerning SIEM, all the information contained in the folder can be used. Many commercial SIEM platforms allow importing a .pcap file. This makes it possible to test directly the rule set used to detect cyber-attacks in the environment under analysis. The SIEM platform typically does not collect all traffic logs, but rather alarms and events generated by other cybersecurity tools such as IDS, Firewall, and EDR. In this case, a recommendation is to analyze the .pcap file by using the IDS engine and to import the resulting alarms into the SIEM. This approach enhances attack detection on both platforms by directly modifying the IDS ruleset and by using correlation rules that leverage all the information collected by the SIEM.

VIII. CONCLUSION
The development and dissemination of the smart industry testbed dataset ICS-ADD is intended as a significant step forward in the field of cyber-physical security for industrial control systems. ICS-ADD, featuring a wide array of simulated cyber-attacks and corresponding outputs from leading open-source security monitoring tools like OSSIM and Suricata, should provide a useful resource for the cybersecurity community. Several key findings emerged from the activity with this dataset. First, the detailed traffic captures and tool outputs underscore the complexity and variety of modern cyber threats in industrial environments, and highlight the need for advanced and adaptable security mechanisms. Second, the performance analysis of OSSIM and Suricata by using ICS-ADD reveals potential areas of improvement for these tools, particularly in the context of detecting sophisticated or novel attack vectors. Finally, the open-source nature of the dataset encourages collaboration and innovation within the cybersecurity research community, paving the way for the development of more resilient and effective security strategies. In conclusion, ICS-ADD not only seems useful as a tool for current cybersecurity research and development but may also establish an operational basis for future advancements in the protection of industrial control systems against evolving threats.

REFERENCES
[1] S. Neupane, J. Ables, W. Anderson, S. Mittal, S. Rahimi, I. Banicescu, and M. Seale, "Explainable intrusion detection systems (X-IDS): A survey of current methods, challenges, and opportunities," IEEE Access, vol. 10, pp. 112392–112415, 2022.
[2] N. Moustafa, N. Koroniotis, M. Keshk, A. Y. Zomaya, and Z. Tari, "Explainable intrusion detection for cyber defences in the Internet of Things: Opportunities and solutions," IEEE Commun. Surveys Tuts., vol. 25, no. 3, pp. 1775–1807, 3rd Quart., 2023.
[3] F. F. Alruwaili, "Intrusion detection and prevention in industrial IoT: A technological survey," in Proc. Int. Conf. Electr., Comput., Commun. Mechatronics Eng. (ICECCME), Oct. 2021, pp. 1–5.
[4] M. Conti, D. Donadel, and F. Turrin, "A survey on industrial control system testbeds and datasets for security research," IEEE Commun. Surveys Tuts., vol. 23, no. 4, pp. 2248–2294, 4th Quart., 2021.
[5] Á. L. P. Gómez, L. F. Maimó, A. H. Celdrán, F. J. G. Clemente, C. Cadenas Sarmiento, C. J. D. C. Masa, and R. M. Nistal, "On the generation of anomaly detection datasets in industrial control systems," IEEE Access, vol. 7, pp. 177460–177473, 2019.
[6] M. E. Alim, J. Smalligan, and T. H. Morris, "A collection of datasets and simulation frameworks for industrial control system research," in Proc. SoutheastCon, 2023, pp. 96–103.
[7] T. Alves and T. Morris, "OpenPLC: An IEC 61131-3 compliant open source industrial controller for cyber security research," Comput. Secur., vol. 78, pp. 364–379, Sep. 2018.
[8] H. A. Chattha, M. M. U. Rehman, G. Mustafa, A. Q. Khan, M. Abid, and E. U. Haq, "Implementation of cyber-physical systems with modbus communication for security studies," in Proc. Int. Conf. Cyber Warfare Secur. (ICCWS), Nov. 2021, pp. 45–50.
[9] C. Zheng, X. Wang, X. Luo, C. Fang, and J. He, "An OpenPLC-based active real-time anomaly detection framework for industrial control systems," in Proc. China Autom. Congr. (CAC), Nov. 2022, pp. 5899–5904.
[10] M. E. Alim, S. R. Wright, and T. H. Morris, "A laboratory-scale canal SCADA system testbed for cybersecurity research," in Proc. 3rd IEEE Int. Conf. Trust, Privacy Secur. Intell. Syst. Appl. (TPS-ISA), Dec. 2021, pp. 348–354.
[11] S. Fujita, K. Rata, A. Mochizuki, K. Sawada, S. Shin, and S. Hosokawa, "On experimental validation of whitelist auto-generation method for secured programmable logic controllers," in Proc. 44th Annu. Conf. IEEE Ind. Electron. Soc., Oct. 2018, pp. 2385–2390.
[12] T. Alves, R. Das, and T. Morris, "Embedding encryption and machine learning intrusion prevention systems on programmable logic controllers," IEEE Embedded Syst. Lett., vol. 10, no. 3, pp. 99–102, Sep. 2018.
[13] Pfsense. Accessed: Feb. 1, 2024. [Online]. Available: https://www.pfsense.org/
[14] Scadabr. Accessed: Feb. 1, 2024. [Online]. Available: https://github.com/ScadaBR
[15] Openplc Runtime. Accessed: Feb. 1, 2024. [Online]. Available: https://autonomylogic.com/docs/2-1-openplc-runtime-overview/
[16] Alienvault Ossim - The World's Most Widely Used Open-source Siem. Accessed: Feb. 1, 2024. [Online]. Available: https://cybersecurity.att.com/products/ossim
[17] M. J. Assante and R. M. Lee, "The industrial control system cyber kill chain," SANS Inst. InfoSec Reading Room, vol. 1, p. 24, Oct. 2015.
[18] Reversednshell - Github. Accessed: Feb. 1, 2024. [Online]. Available: https://github.com/ahhh/Reverse_DNS_Shell
[19] I. Frazão, P. H. Abreu, T. Cruz, H. Araújo, and P. Simões, "Denial of service attacks: Detecting the frailties of machine learning algorithms in the classification process," in Critical Information Infrastructures Security, Kaunas, Lithuania. Springer, 2019, pp. 230–235.
[20] V. Kelli, P. Radoglou-Grammatikis, T. Lagkas, E. K. Markakis, and P. Sarigiannidis, "Risk analysis of DNP3 attacks," in Proc. IEEE Int. Conf. Cyber Secur. Resilience (CSR), Jul. 2022, pp. 351–356.
[21] A. Lemay and J. M. Fernandez, "Providing SCADA network data sets for intrusion detection research," in Proc. 9th Workshop Cyber Secur. Experimentation Test, 2016, pp. 1–8.
[22] T. Morris and W. Gao, "Industrial control system traffic data sets for intrusion detection research," in Critical Infrastructure Protection VIII, Arlington, VA, USA. Springer, 2014, pp. 65–78.
[23] H. Huang, P. Wlazlo, A. Sahu, A. Walker, A. Goulart, K. Davis, L. Swiler, T. Tarman, and E. Vugrin, "Dataset of port scanning attacks on emulation testbed and hardware-in-the-loop testbed," Tech. Rep., 2022, doi: 10.21227/cva5-nd75.
[24] M. Teixeira, T. Salman, M. Zolanvari, R. Jain, N. Meskin, and M. Samaka, "SCADA system testbed for cybersecurity research using machine learning approach," Future Internet, vol. 10, no. 8, p. 76, Aug. 2018.
[25] L. Faramondi, F. Flammini, S. Guarino, and R. Setola, "A hardware-in-the-loop water distribution testbed dataset for cyber-physical security testing," IEEE Access, vol. 9, pp. 122385–122396, 2021.
[26] G. B. Gaggero and A. Armellin, "ICS-ADD - A smart industry testbed dataset for cyber-physical security monitoring testing," Tech. Rep., 2024, doi: 10.21227/4zht-tr07.
[27] A. Armellin, G. B. Gaggero, A. Cattelino, L. Piana, S. Raggi, and M. Marchese, "Integrating OT data in SIEM platforms: An energy utility perspective," in Proc. Int. Conf. Electr., Commun. Comput. Eng. (ICECCE), Dec. 2023, pp. 1–7.

ALESSANDRO ARMELLIN (Graduate Student Member, IEEE) received the master's degree in electrical engineering from the University of Genoa, in March 2021. He is currently pursuing the Ph.D. degree with the Satellite Communications and Heterogeneous Networking Laboratory (SCNL), University of Genoa. He collaborates with Iren S.p.A., an important Italian energy company. His research interests include cybersecurity of industrial control systems, microgrids, and smart grids.

GIANCARLO PORTOMAURO (Member, IEEE) received the Laurea degree in computer science engineering from the University of Genoa, Italy, in 2002, and the Engineering degree, in 2002. Since May 2001, he has been with the Italian Consortium of Telecommunications (CNIT), University of Genoa Research Unit, as a member of the SCNL research staff. He is currently a Research Fellow with the Satellite Communications and Networking Laboratory (SCNL), University of Genoa. His main research interests include emulation of on-board satellite systems, reliable quality of service satellite networks for multimedia applications, software for satellite emulation, wide network systems, installation and training of video conference tools for remote training and instruments access, and design and realization of event simulators for heterogeneous packet switching networks.
Open Access funding provided by ‘Università degli Studi di Genova’ within the CRUI CARE Agreement