A Synthesized Dataset For Cybersecurity Study of IEC 61850 Based Substation
Abstract: Cyber attacks pose a major threat to smart grid infrastructures where communication links bind physical devices to provide critical measurement, protection, and control functionalities. The substation is an integral part of a power system. Modern substations with intelligent electronic devices and remote access interfaces are more prone to cyber attacks. Hence, there is an urgent need to consider cybersecurity at the electrical substation level. This paper makes a systematic effort to develop a synthesized dataset focusing on IEC 61850 GOOSE communication, which is essential for automation and protection in the smart grid. The dataset is intended to help the research community study the cybersecurity of substations. We present the physical system of a typical distribution-level substation and several of its critical electrical protection operation scenarios under different disturbances, followed by several cyber-attack scenarios. We have generated a dataset with multiple traces that correspond to these scenarios and demonstrated how the dataset can be used to support substation cybersecurity research.

I. INTRODUCTION

The complexity of the power system is growing day by day with the inclusion of smart devices and communication networks. Maintaining network security and resilience is becoming more challenging as the smart grid is vulnerable to cyber attacks. A cyber attack can intrude into the computer network system, thus compromising its authenticity, integrity, and availability [1]. As a consequence, the operation of a modern power grid, where large communication networks and infrastructure are prevalent, can be jeopardised. In recent years, extensive research has been carried out to study possible cyber threat scenarios and their mitigation strategies for power grids [2], [3], [4], [5].

Although most research efforts focus on security aspects of the whole power grid, not many publications concentrate on electrical substations. Substations are important entities in the power grid, at both the transmission level and the distribution level. The primary function of a substation is to convert one voltage level to another and host communication among devices in the station, bay and process levels. The advanced communication needs increase the attack surface of the substation, potentially increasing the probability and magnitude of cyber attacks. To detect these attacks, many cybersecurity solutions such as intrusion detection systems (IDS) have been proposed. However, there is currently no realistic substation dataset that can be used to validate them.

At present, many researchers adopt heuristic reasoning as a means of analyzing security, which may lead to certain bias. Others have developed their own datasets, which are non-standard and unpublished. The difficulties in developing standard datasets can be attributed to the gap in knowledge across various domains (e.g., the domain knowledge required to understand the various functionalities and operating modes of a substation, and the cybersecurity knowledge needed about the evolving threats and attack techniques), as well as the absence of standard communication protocols in electrical substations prior to the advent of IEC 61850 [6]. Consequently, key challenges faced by the cybersecurity community and industrial practitioners are low reproducibility, low comparability, and a lack of peer-validated research. These factors motivate us to develop a benchmark dataset for electrical substation cybersecurity study. Though there are intrusion datasets for generic computer network systems [7] and the smart grid [8], our benchmark dataset will facilitate the study of cyber attacks specifically on substations. Opportunities that we leverage to conduct this work are enhanced cross-domain communication, systematization of industrial control system attacks, and the standard communication protocol IEC 61850, which deals with the heterogeneity of IEDs from different vendors. By specifying the workflow for creating a dataset, we aim to provide a realistic synthesized benchmark dataset for the evaluation of cybersecurity solutions such as IDS and false data detectors.

II. RELATED WORK

To enable the cybersecurity study of industrial networks, several testbeds have been developed to capture network traffic for experimentation and testing purposes. These testbeds include the “Geek Lounge Lab” deployed at the 4SICS conference [9] and the Electric Power Intelligent Control (EPIC) testbed at the Singapore University of Technology and Design [10]. Unfortunately, the 4SICS testbed does not support IEC 61850-based traffic, which is widely used in smart grid systems. Although the EPIC dataset [11] captures Manufacturing Message Specification (MMS) messages, it does not contain GOOSE messaging, which is the focus of this paper because GOOSE communication is prevalent for automated protection and control in modernized substations. Furthermore, the dataset does not contain attack traces. Other popular datasets for evaluating network-based IDS include KDDCup99 [12], NSL-KDD [13], UNSW-NB15 [7], and CICIDS2017 [14]. However, these datasets are better suited to modeling intrusions in traditional computer networks. They
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)
[Figure 1: one-line diagram of the substation. Labels recoverable from the diagram: substation boundary, kV S/S, kV bus, CB, LIED, UFIED, IEC 61850 GOOSE. Legend: CB = circuit breaker; T = transformers; XIED = Intelligent Electronic Device, where X is L (line), T (transformer) or B (bus); UF = underfrequency.]
Fig. 1. Substation one-line diagram. The prefixes to IEDs denote ‘L’ for line, ‘T’ for transformer and ‘B’ for bus.
bay level, all the Intelligent Electronic Devices (IEDs) are interconnected via Ethernet LAN cables and communicate among themselves using the IEC 61850 GOOSE protocol. It is worth noting that the IEC 61850 standard [6] defines communication protocols for IEDs in substations. A number of protocols (such as GOOSE, MMS and SV) can be mapped to the abstract data models defined in this standard. In IEC 61850 GOOSE communication, data (status, alarms, measurements) of any format is grouped into a data set and transmitted over the electrical substation network in a fast and reliable manner. The connection between the IEDs and the Human Machine Interface (HMI) at station level is usually established by LAN via the station server. IEC 61850 utilizes MMS, an international standard (ISO 9506) dealing with messaging systems for transferring real-time process data and supervisory control information between network devices (such as IEDs) and the station server.

C. Attack-Free Scenarios in a Substation

Besides the communication under normal operation, we consider 3 representative disturbance scenarios under which the substation protection system operates. These are not attacks, but abnormal operation due to faults in the power system. Under normal operation, these scenarios do not exist. While describing the benchmark datasets (presented in Section VI), we present the attack-free normal operation dataset and the attack-free selected abnormal scenario datasets separately.

1. Busbar protection:
• Let us consider that a fault occurs at busbar 66kV bus-1. The incomer line LIED10 will pick up on overcurrent; the other IEDs will not.
• The incomer line LIED10 will know through GOOSE communication that the overcurrent elements of the other IEDs have not picked up.
• The incomer line LIED10 will quickly recognise the busbar fault and trigger a trip to its own breaker CB-10 first and subsequently to the breakers associated with the busbar, i.e., CB-11, CB-12 and CB-13.
• The trip status of CB-10 is sent by LIED10 through GOOSE communication to LIED11, LIED12 and TIED13 to trigger a trip for their respective breakers.

2. Breaker failure protection:
• Assume that a fault occurs in the feeder connecting substation S/S 3-1. The associated LIED11 overcurrent (O/C) element picks up; however, the breaker CB-11 does not trip due to mechanical failure.
• The GOOSE communication of breaker failure and O/C element pick-up is sent from LIED11 to LIED10 (incomer), LIED12 and TIED13.
• The communication triggers tripping of circuit breakers CB-10, CB-12 and CB-13. Subsequently, the remote CB (in S/S 3-1) is tripped using proper communication media.

3. Underfrequency load-shedding:
• The underfrequency IED (UFIED) can sense underfrequency in both 11kV buses when voltage transformer (VT) inputs are directly given to this IED.
• Alternatively, the incomer line IEDs (LIED30 and LIED40) can relay the information to the underfrequency IED over GOOSE (when there is no direct VT input to UFIED).
• UFIED first triggers a trip over GOOSE to the lowest-priority (Priority 6) consumer via LIED43, so CB-43 is tripped in the first stage.
• After a time delay (usually 2-4 seconds), if the frequency has not stabilized at the desired value, further loads are shed. The sequence of tripping goes as: Priority 6 (CB-43) → Priority 5 (CB-42) → Priority 4 (CB-32) . . .
• Between every two stages of tripping there is a time delay of 2-4 seconds. The trip is initiated via GOOSE communication to the respective IEDs such as LIED43, LIED42, LIED32, etc.

IV. CYBER ATTACK SCENARIOS

In this section, we define the threat model and the GOOSE communication model, and discuss some attacks encountered by the substation.

A. Threat Model and Assumptions

The engineering and operator workstations are HMIs deployed in a substation to provide users with a graphical user interface for monitoring and controlling devices. While access to the operator workstations is restricted to authorized personnel locally, the engineering workstations may be equipped with remote access capabilities that allow access from locations outside the substation network, i.e., corporate offices and control centers. As a result, they become soft targets and easy entry points for attackers to infiltrate the substation. External attackers can use social engineering techniques or phishing emails to compromise the engineering workstation and gain a foothold in the substation network, as in the Ukraine case [18]. Once inside the network, the attackers can gather knowledge about the substation’s topology and the IEDs’ operations, including the IEDs’ login credentials. With this information, the attackers can launch attacks on any IED of their choice, assuming that both the engineering workstation and the IEDs are connected to the same LAN and that no VLAN membership is configured on the switch.

B. Attack Model Against GOOSE Communication

Among the communication models, the GOOSE protocol is of particular interest as it is the key driver for automated control and protection. Under the threat model discussed in the previous subsection, an attacker can compromise the GOOSE communication to negatively impact the protection scheme described in Section III-C. In fact, GOOSE messaging is not encrypted for performance reasons. Thus, attackers can eavesdrop on, analyze, and spoof GOOSE frames [19]. By spoofing, we mean that an attacker can masquerade as a legitimate IED to inject GOOSE frames. In doing so, the attacker can send malicious GOOSE frames to various IEDs to cause damage, including modifying the response messages sent from the IEDs to mislead the operators about the actual state of the substation. The following is a
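
Returning to the protection scheme of Section III-C that such attacks target, the staged underfrequency load-shedding sequence can be sketched in Python. This is a minimal illustration under stated assumptions, not the toolchain used to generate the dataset: the 49.0 Hz pickup threshold, the 3 s stage delay (picked from the 2-4 s range given in the text), and the names `Stage`, `shed_load`, `read_frequency` and `wait` are hypothetical, and appending to a list stands in for publishing a GOOSE trip message. Only the three stages named in the text are listed; the remaining priorities are not specified there.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

UF_THRESHOLD_HZ = 49.0  # assumption: underfrequency pickup setting (not from the paper)
STAGE_DELAY_S = 3.0     # assumption: one value within the 2-4 s range given in the text


@dataclass(frozen=True)
class Stage:
    priority: int  # 6 is the lowest-priority consumer and is shed first
    ied: str       # IED that receives the trip over GOOSE
    breaker: str   # circuit breaker opened by that IED


# Only the stages named in the text; later stages are elided there.
SHED_ORDER: Tuple[Stage, ...] = (
    Stage(6, "LIED43", "CB-43"),
    Stage(5, "LIED42", "CB-42"),
    Stage(4, "LIED32", "CB-32"),
)


def shed_load(
    read_frequency: Callable[[], float],
    wait: Callable[[float], None] = lambda seconds: None,  # pass time.sleep for real delays
    stages: Sequence[Stage] = SHED_ORDER,
    threshold_hz: float = UF_THRESHOLD_HZ,
    delay_s: float = STAGE_DELAY_S,
) -> List[Tuple[str, str]]:
    """Trip one stage at a time until the frequency recovers.

    Returns the (IED, breaker) pairs that were tripped, in order.
    """
    tripped: List[Tuple[str, str]] = []
    for stage in stages:
        if read_frequency() >= threshold_hz:
            break  # frequency has stabilized, so no further load is shed
        tripped.append((stage.ied, stage.breaker))  # stand-in for a GOOSE trip message
        wait(delay_s)  # inter-stage delay before the frequency is checked again
    return tripped


if __name__ == "__main__":
    readings = iter([48.6, 48.8, 49.2])  # hypothetical frequency samples in Hz
    print(shed_load(lambda: next(readings)))
    # → [('LIED43', 'CB-43'), ('LIED42', 'CB-42')]
```

In this sketch the frequency recovers after the second stage, so CB-32 stays closed. A spoofed trip frame of the kind described in Section IV-B would bypass the frequency check entirely, which is why traces of both the legitimate sequence and injected trips are useful for evaluating an IDS.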
TABLE II
DESCRIPTION OF EACH NETWORK TRACE IN OUR DATASET, INCLUDING PRELIMINARY RESULTS OF SOME IDSES AGAINST EACH NETWORK TRACE. – MEANS CANNOT DETECT; MEANS CAN DETECT
(e.g., the one described in [26]). A more advanced IDS that can validate the readings by cross-checking among IEDs will provide better detection accuracy. Furthermore, variation in load is a common phenomenon in power systems. Network loading can be anywhere between 0% and 100%. An IDS employing a statistical approach, or sometimes even a knowledge-based approach, may trigger false alarms [30] under different loading conditions. A comprehensive dataset incorporating multiple load levels and attack-free disturbance scenarios (e.g., those in our BusbarProtection.pcapng, BreakFailure.pcapng, VariableLoad.pcapng, etc. pcap traces) would help evaluate the performance and thresholds of the IDS.

VIII. CONCLUSION

In this paper, we discuss the design of IEC 61850 GOOSE network traffic traces that can be used to benchmark smart grid security solutions, such as intrusion detection systems. We elaborate a realistic substation system topology and the operation models used as the basis for traffic generation. We further discuss attack models against GOOSE-based communication, which are used for injecting attacks into the traces. The generated traces are published online. In the future, we plan to test the generated attack traces on state-of-the-art IDS solutions available in the market as well as those developed in academia. Furthermore, we intend to publish the toolchain we have implemented for the generation of such traces, so that researchers can use it to generate attack-free and/or attack-induced traces that are of interest to them.

ACKNOWLEDGEMENT

This research is supported in part by the National Research Foundation, Prime Minister’s Office, Singapore under the Energy Programme and administered by the Energy Market Authority (EP Award No. NRF2017EWT-EP003-047), and in part by the National Research Foundation, Prime Minister’s Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme.

REFERENCES

[1] R. Heady, G. Luger, A. Maccabe, and M. Servilla, “The architecture of a network level intrusion detection system,” Los Alamos National Lab., NM (United States); New Mexico Univ., Albuquerque, Tech. Rep., 1990.
[2] A. Gupta, A. Anpalagan, G. H. Carvalho, L. Guan, and I. Woungang, “Prevailing and emerging cyber threats and security practices in IoT-enabled smart grids: A survey,” Journal of Network and Computer Applications, vol. 132, pp. 118–148, 2019.
[3] H. He and J. Yan, “Cyber-physical attacks and defences in the smart grid: a survey,” IET Cyber-Physical Systems: Theory & Applications, vol. 1, no. 1, pp. 13–27, 2016.
[4] W. Wang and Z. Lu, “Cyber security in the smart grid: Survey and challenges,” Computer Networks, vol. 57, no. 5, pp. 1344–1371, 2013.
[5] H. C. Tan, C. Cheh, B. Chen, and D. Mashima, “Tabulating cybersecurity solutions for electrical substations towards pragmatic design and planning,” presented at the IEEE Innovative Smart Grid Technologies-Asia (ISGT Asia), Chengdu, China. IEEE, 2019.
[6] “IEC 61850 - communication networks and systems in substations,” May 2019. [Online]. Available: https://webstore.iec.ch/
[7] N. Moustafa and J. Slay, “UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set),” in 2015 Military Communications and Information Systems Conference (MilCIS). IEEE, 2015, pp. 1–6.
[8] V. Babu, R. Kumar, H. H. Nguyen, D. M. Nicol, K. Palani, and E. Reed, “Melody: synthesized datasets for evaluating intrusion detection systems for the smart grid,” in 2017 Winter Simulation Conference (WSC). IEEE, 2017, pp. 1061–1072.
[9] “Capture files from 4SICS geek lounge,” April 2019. [Online]. Available: https://www.netresec.com/?page=PCAP4SICS
[10] S. Adepu, N. K. Kandasamy, and A. Mathur, “EPIC: An electric power testbed for research and training in cyber physical systems security,” in Computer Security. Springer, 2018, pp. 37–52.
[11] “EPIC dataset,” April 2019. [Online]. Available: https://itrust.sutd.edu.sg/itrust-labs_datasets/dataset_info/epic
[12] “KDD Cup 1999 data,” May 2019. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[13] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD Cup 99 data set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications. IEEE, 2009, pp. 1–6.
[14] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” in ICISSP, 2018, pp. 108–116.
[15] O. Hegazi, E. Hammad, A. Farraj, and D. Kundur, “IEC-61850 GOOSE traffic modeling and generation,” in 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 2017, pp. 1100–1104.
[16] Y. Lopes, D. C. Muchaluat-Saade, N. C. Fernandes, and M. Z. Fortes, “Geese: A traffic generator for performance and security evaluation of IEC 61850 networks,” in 2015 IEEE 24th International Symposium on Industrial Electronics (ISIE). IEEE, 2015, pp. 687–692.
[17] S. M. Blair, F. Coffele, C. D. Booth, and G. M. Burt, “An open platform for rapid-prototyping protection and control schemes with IEC 61850,” IEEE Transactions on Power Delivery, vol. 28, no. 2, pp. 1103–1110, 2013.
[18] Defence Use Case, “Analysis of the cyber attack on the Ukrainian power grid,” 2016.
[19] M. T. A. Rashid, S. Yussof, Y. Yusoff, and R. Ismail, “A review of security attacks on IEC61850 substation automation system network,” in Proceedings of the 6th International Conference on Information Technology and Multimedia. IEEE, 2014, pp. 5–10.
[20] J. Hoyos, M. Dehus, and T. X. Brown, “Exploiting the GOOSE protocol: A practical attack on cyber-infrastructure,” in 2012 IEEE Globecom Workshops. IEEE, 2012, pp. 1508–1513.
[21] “PowerWorld Simulator Overview,” April 2019. [Online]. Available: https://www.powerworld.com/products/simulator/overview
[22] “Power System Software Engineering,” April 2019. [Online]. Available: https://www.digsilent.de/en/
[23] “Tcpreplay - Pcap editing and replaying utilities,” April 2019. [Online]. Available: https://tcpreplay.appneta.com/
[24] “GOOSE dataset,” May 2019. [Online]. Available: https://github.com/smartgridadsc/IEC61850SecurityDataset
[25] W. Ren, T. Yardley, and K. Nahrstedt, “EDMAND: Edge-based multi-level anomaly detection for SCADA networks,” in 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). IEEE, 2018, pp. 1–7.
[26] Y. Yang, H.-Q. Xu, L. Gao, Y.-B. Yuan, K. McLaughlin, and S. Sezer, “Multidimensional intrusion detection system for IEC 61850-based SCADA networks,” IEEE Transactions on Power Delivery, vol. 32, no. 2, pp. 1068–1078, 2017.
[27] J. Hong, C.-C. Liu, and M. Govindarasu, “Detection of cyber intrusions using network-based multicast messages for substation automation,” in ISGT 2014. IEEE, 2014, pp. 1–5.
[28] J. Hong, C.-C. Liu, and M. Govindarasu, “Integrated anomaly detection for cyber security of the substations,” IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1643–1653, 2014.
[29] M. Kabir-Querrec, S. Mocanu, P. Bellemain, J.-M. Thiriet, and E. Savary, “Corrupted GOOSE detectors: Anomaly detection in power utility real-time Ethernet communications,” in GreHack 2015, 2015.
[30] E. Biermann, E. Cloete, and L. M. Venter, “A comparison of intrusion detection systems,” Computers & Security, vol. 20, no. 8, pp. 676–683, 2001.