DeepFed_ Federated Deep Learning for Intrusion Detection in Industrial Cyber–Physical Systems

The document presents DeepFed, a federated deep learning framework designed for intrusion detection in industrial cyber-physical systems (CPSs), addressing the challenges posed by insufficient high-quality attack examples. It combines a convolutional neural network and gated recurrent unit for effective detection, while ensuring data privacy through a secure communication protocol based on the Paillier cryptosystem. The proposed model demonstrates superior performance in identifying various cyber threats compared to existing methods.

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/351360140

DeepFed: Federated Deep Learning for Intrusion Detection in Industrial

Cyber-Physical Systems

Article in IEEE Transactions on Industrial Informatics · September 2020

DOI: 10.1109/TII.2020.3023430

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 17, NO. 8, AUGUST 2021 5615

DeepFed: Federated Deep Learning for Intrusion Detection in Industrial Cyber–Physical Systems

Beibei Li, Member, IEEE, Yuhao Wu, Jiarui Song, Rongxing Lu, Senior Member, IEEE, Tao Li, Member, IEEE, and Liang Zhao, Member, IEEE

Abstract—The rapid convergence of legacy industrial infrastructures with intelligent networking and computing technologies (e.g., 5G, software-defined networking, and artificial intelligence) has dramatically increased the attack surface of industrial cyber–physical systems (CPSs). However, withstanding cyber threats to such large-scale, complex, and heterogeneous industrial CPSs has been extremely challenging, due to the insufficiency of high-quality attack examples. In this article, we propose a novel federated deep learning scheme, named DeepFed, to detect cyber threats against industrial CPSs. Specifically, we first design a new deep learning-based intrusion detection model for industrial CPSs, by making use of a convolutional neural network and a gated recurrent unit. Second, we develop a federated learning framework, allowing multiple industrial CPSs to collectively build a comprehensive intrusion detection model in a privacy-preserving way. Further, a Paillier cryptosystem-based secure communication protocol is crafted to preserve the security and privacy of model parameters through the training process. Extensive experiments on a real industrial CPS dataset demonstrate the high effectiveness of the proposed DeepFed scheme in detecting various types of cyber threats to industrial CPSs and its superiority over state-of-the-art schemes.

Index Terms—Data privacy, deep learning, federated learning, industrial cyber–physical system (CPS), intrusion detection.

Manuscript received June 30, 2020; revised August 16, 2020; accepted August 28, 2020. Date of publication September 11, 2020; date of current version May 3, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant U1736212, Grant U19A2068, Grant 61302161, and Grant 61972269, in part by the China Postdoctoral Science Foundation under Grant 2019TQ0217 and Grant 2020M673277, in part by the Provincial Key Research and Development Program of Sichuan under Grant 2020YFG0133, in part by the Fundamental Research Funds for the Central Universities under Grant YJ201933, in part by Doctoral Fund, Ministry of Education, China under Grant 20130181120076, and in part by the China International Postdoctoral Exchange Fellowship Program (Talent-Introduction). Paper no. TII-20-3189. (Corresponding author: Liang Zhao.)

Beibei Li, Yuhao Wu, Jiarui Song, Tao Li, and Liang Zhao are with the College of Cybersecurity, Sichuan University, Chengdu 610065, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; zhaoliangjapan@scu.edu.cn).

Rongxing Lu is with the Faculty of Computer Science, University of New Brunswick, Fredericton, NB E3B 5A3, Canada (e-mail: [email protected]).

Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TII.2020.3023430

I. INTRODUCTION

INDUSTRIAL cyber–physical systems (CPSs) are generally referred to as large-scale, geographically-dispersed, complex, and heterogeneous Internet-of-Things (IoT) in an industrial context, such as smart grids, autonomous transportation systems, and gas pipelining systems [1]–[3]. Industrial CPSs integrate intelligent networking and computing technologies, such as 5G (and beyond), software-defined networking (SDN), network function virtualization, cloud computing, and artificial intelligence (AI), with existing industrial control systems (ICSs), a general architecture of which is shown in Fig. 1. Industrial CPSs are envisioned to facilitate remote access, promote smart services, enable big data analytics, and allow better provisioning of network resources [4].

The benefits from industrial CPSs seem clear, but these advancements have not come without risk [5]–[7]. Legacy industrial infrastructures have been implemented with poor security measures, leaving numerous potential vulnerabilities unattended. The rapid fusion of advanced networking and computing technologies has dramatically expanded the threat landscape by opening up new vulnerabilities that can be exploited across softwarized endpoints, networks, applications, and cloud services. One high-profile security incident is the BlackEnergy malware-based cyber assault on Ukraine's power grid in December 2015 [8], where more than 30 power substations were switched off, and about 230 thousand people were left in the dark for a period from one to six hours. Other notorious cyber incidents associated with industrial CPSs include the Stuxnet attack on Iran's nuclear power plant [9], VPNFilter on supervisory control and data acquisition (SCADA) protocols [10], and the unauthorized penetration of Australia's Maroochy sewage factory [11]. Such incidents demonstrate that industrial CPSs are very likely to remain ongoing targets of interest in the near future, particularly for state-sponsored or affiliated actors. The importance of cybersecurity in industrial CPSs is reinforced by the U.S. Department of Homeland Security in the 2016 ICS-CERT Annual Assessment Report [12], which remarked that "rapid increases in the connectivity of operational technology through the Internet of Things raises new challenges for control systems security," and also by the U.S. Department of Commerce in the NIST Guide to ICS security [13], stating that "cybersecurity is essential to the safe and reliable operation of modern industrial processes."

1551-3203 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: SICHUAN UNIVERSITY. Downloaded on May 06,2021 at 13:19:50 UTC from IEEE Xplore. Restrictions apply.
Fig. 1. General architecture of industrial CPSs.

The state-of-the-art literature has seen an increasing interest in addressing cybersecurity issues pertaining to industrial CPSs, placing priorities on AI-relevant intrusion detection schemes in recent years. For example, in 2019, Qiu et al. [14] developed a dueling deep Q-learning-based approach to mitigating cyber threats against safe communications in software-defined industrial IoT. In early 2020, Ismail et al. [15] investigated electricity theft attacks in smart grid CPSs, and proposed a deep learning-based intrusion detection system for such cyberattacks. More recent studies can be seen in Section II. Unfortunately, most of the existing AI-relevant intrusion detection schemes associated with industrial CPSs are developed on a strong assumption that sufficient high-quality examples of cyberattacks on industrial CPSs are always readily available for the system defender to build a desired intrusion detection model. In real-world scenarios, however, one industrial CPS owner usually has rather limited attack examples, making the model building work incredibly challenging. Further, industrial CPS owners are usually unwilling to share such attack examples (or even normal behavior examples) with third parties, because highly sensitive information about their critical industrial CPSs is always involved in these data resources. In such situations, building a desired AI-based intrusion detection model for industrial CPSs is an apparently intractable task.

To address these challenges, we propose DeepFed, a federated deep learning scheme for detecting cyber threats against industrial CPSs. Specifically, we first design a novel deep learning model, based on a convolutional neural network (CNN) and a gated recurrent unit (GRU), to detect various types of cyber threats against industrial CPSs. In addition, we develop a new federated learning framework for multiple industrial CPS owners to collectively build a comprehensive intrusion detection model in a privacy-preserving way. Moreover, we design a secure communication protocol based on the Paillier public-key cryptosystem to preserve the security and privacy of model parameters through the training process. The main contributions of this work are threefold.

1) First, we create a novel deep learning-based intrusion detection model for industrial CPSs, by making use of CNN and GRU. This model is highly effective in detecting various types of cyber threats against industrial CPSs, such as denial-of-service (DoS), reconnaissance, response injection, and command injection attacks.

2) Second, a federated learning framework is developed, which, on the one hand, enables building a comprehensive intrusion detection model by taking advantage of data resources from multiple industrial CPS owners (in the same domain). On the other hand, this framework supports data processing at each industrial CPS's own premise, allowing effective privacy preservation of data resources.

3) Third, we craft a Paillier public-key cryptosystem-based secure communication protocol for the developed federated learning framework, by which the security and privacy of model parameters through the training process can be well preserved.

The remainder of this article is organized as follows. In Section II, we review the state-of-the-art studies on intrusion detection schemes for industrial CPSs and federated learning-based intrusion detection methods. In Section III, we introduce the system model and threat model considered in this work. Section IV elaborates on the proposed DeepFed scheme. Section V gives the performance evaluation. Finally, Section VI concludes this article.

II. RELATED WORK

In this section, we briefly review the state-of-the-art studies focusing on intrusion detection schemes for industrial CPSs as well as federated learning-based intrusion detection methods.

A. Intrusion Detection Schemes for Industrial CPSs

Recent years have witnessed an increasing research interest in intrusion detection schemes in the context of industrial CPSs. For example, in 2018, Yang et al. [16] designed an approach based on zone partition to detect both known and unknown cyberattacks for industrial CPSs, even when several zones are compromised simultaneously. Also, Wang et al. [17] in 2018 proposed a stacked auto-encoder-based deep learning scheme to detect two-stage sparse cyberattacks against the AC state estimation in smart grid CPSs. In 2019, Qiu et al. [14] developed a dueling deep Q-learning-based approach to mitigate cyber threats against safe communications in software-defined industrial IoT. In the same year, Yang et al. [18] designed a CNN-based intrusion detection system for SCADA networks, in order to protect industrial CPSs from both conventional and SCADA-specific network-based cyberattacks. In early 2020, Ismail et al. [15] investigated electricity theft attacks in smart grid CPSs and proposed a deep learning-based intrusion detection system for such cyberattacks. Also in 2020, Liu et al. [19] presented a hierarchically distributed intrusion detection framework for the security monitoring of large-scale industrial CPSs. It takes advantage of the security monitoring of physical systems and information systems to achieve all-round security protection of industrial CPSs.

B. Federated Learning-Based Intrusion Detection Methods

Emerged as a promising tool for addressing data islands issues in recent years, federated learning has been widely adopted
in many areas. Particularly, a series of researchers have recently conducted federated learning-based studies to achieve intrusion detection. For example, in 2018, Preuveneers et al. [20] described a permissioned blockchain-based federated learning method to achieve an anomaly detection machine learning model, where contributing parties in federated learning can be held accountable and have their model updates audited. In 2019, Nguyen et al. [21] designed an autonomous self-learning distributed system for detecting compromised IoT devices, which employed a federated learning approach to achieve intrusion detection. In the same year, Zhao et al. [22] proposed a multitask deep neural network in federated learning (MT-DNN-FL) to perform the network anomaly detection task. In 2020, Chen et al. [23] proposed a federated deep autoencoding Gaussian mixture model (FDAGMM) to improve the disappointing performance of traditional DAGMM in network anomaly detection caused by a limited amount of data.

III. SYSTEM MODEL AND THREAT MODEL

In this section, we introduce the system model and threat model considered in this work.

A. System Model

The system model under consideration is a federated deep learning framework (see Fig. 2), which mainly comprises three types of entities, i.e., a trust authority, a cloud server, and K industrial agents.

Fig. 2. System model under consideration.

1) Trust Authority: The trust authority undertakes the task of bootstrapping the whole system, generating public keys and private keys for the Paillier public-key cryptosystem-based secure communication protocol, as well as establishing secure communication channels for the cloud server and each industrial agent.

2) Cloud Server: The cloud server is responsible for building a comprehensive intrusion detection model, by federating the model parameters locally learned at each industrial agent's own premise. Multiple rounds of interactions between the cloud server and each industrial agent are required in order to obtain a final "perfect" intrusion detection model.

3) Industrial Agents: Each industrial agent, on behalf of the industrial CPS owner, is in charge of building a local intrusion detection model based on its own collected industrial CPS data and aiding in updating the parameters of the intrusion detection model by recurrently interacting with the cloud server.

B. Threat Model

In the threat model, we consider both cyber threats targeting the industrial CPSs and those aiming at our federated deep learning framework.

1) Cyber Threats Against Industrial CPSs: Unlike traditional computer systems, industrial CPSs are exposed not only to traditional cyber threats, such as DoS and DDoS attacks, but also to a line of highly customized new cyber threats tailored to industrial systems, such as command injection and response injection attacks. In this article, we consider all the abovementioned cyber threats, with a focus on the following.

a) Reconnaissance attacks are usually conducted for gathering valuable information about industrial CPSs, mapping the network architectures, and identifying device features, such as the manufacturer, model number, supported network protocols, and device addresses.

b) Response injection attacks are generally carried out to interfere with monitoring and reporting the state of a remote process in industrial CPSs. These attacks can falsify responses reported to querying parties, such that biased system state information is provided.

c) Command injection attacks are often launched by injecting falsified control or configuration commands to mislead system behaviors of industrial CPSs. Such attacks can cause unauthorized modification of device configurations, process setpoints, or communication destinations.

d) DoS attacks are usually mounted by flooding the targets with superfluous requests at an extremely high frequency to exhaust the resources of server systems in industrial CPSs, which can disrupt the services or prevent legitimate requests from being fulfilled.

2) Cyber Threats Against Federated Learning Framework: In the considered federated deep learning framework, it is assumed that the trust authority is a fully trusted party, and the cloud server is a semihonest party who is honest in conducting all the given tasks but curious about the model parameters of the intrusion detection model. Also, we assume that all industrial agents are semihonest, who strictly follow the designed protocols but may be interested in other agents' data resources. Further, it is also taken into consideration that malicious eavesdroppers or other external attackers may intercept the communication links in an attempt to access both the data resources of each industrial CPS and the parameters of the intrusion detection model. In this case, we consider the following two types of cyber threats.

a) Eavesdropping of data resources: As for the industrial CPS owners, their data resources for training the intrusion detection model, particularly the attack examples, are highly sensitive and even nationally critical. If shared
with the cloud server, it may lead to considerable business losses or severe national security risks.

b) Eavesdropping of model parameters: The parameters of an intrusion detection model contain critical information about the data resources. If they are accessed by the outside world in an unauthorized way, some basic knowledge of such data resources, e.g., the types of cyber threats or their example distributions, may possibly be leaked.

IV. PROPOSED DEEPFED SCHEME

In this section, we elaborate on the proposed DeepFed scheme
by outlining the scheme workflow first, and then introducing the
designed CNN-GRU-based intrusion detection model, followed
by the Paillier-based secure communication protocol.

A. Workflow of the DeepFed Scheme

The basic idea of the DeepFed scheme is networking multiple
industrial CPS owners to collectively build a deep learning intru-
sion detection model, based on a developed federated learning
framework along with a Paillier-based secure communication
protocol. The complete workflow of the DeepFed scheme can
be described in five phases, which is given below (see also
Algorithm 1 for the workflow).
1) System Initialization: In the system initialization phase, the trust authority bootstraps the whole system by conducting KeyGenerate($\kappa$) (see more details in Section IV-C), by which the public key $PK = \{n, g\}$ as well as the private key $SK = \{\lambda, \mu\}$ used in the Paillier-based secure communication protocol can be generated, and a secure channel between the cloud server and each industrial agent is established. Then, the cloud server selects an array of initial parameters $w_0$ for the deep learning-based intrusion detection model and some other parameters relevant to the model training, i.e., the learning rate $\eta$, exponential decay rates for moment estimates $\rho_1, \rho_2 \in [0, 1)$, a small constant used for numerical stabilization $\varsigma$, loss function $L$, and batch size $B$. In addition, each industrial agent $A_k$ reports the size $N_k$ of its own data resource $D_k$ to the cloud server, where $k \in \mathcal{K} = \{1, 2, \cdots, K\}$, and then the cloud server computes a contribution ratio for each industrial agent by $\alpha_k = N_k / (N_1 + N_2 + \cdots + N_K)$. Last, define a positive integer $R$ denoting the total rounds of communications between the cloud server and an industrial agent.
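The contribution ratio defined above is a plain size-proportional weight. As a minimal sketch in Python (the function names `contribution_ratios` and `weighted_average` are hypothetical and not taken from the paper), the following computes the ratios from the reported data sizes and shows, in plaintext, the weighted sum that the later aggregation phase is meant to produce. In DeepFed itself this sum is assumed to be evaluated over Paillier ciphertexts, not over raw parameters.

```python
def contribution_ratios(sizes):
    """Return alpha_k = N_k / (N_1 + ... + N_K) for the reported data sizes."""
    total = sum(sizes)
    if total <= 0:
        raise ValueError("at least one agent must report a positive data size")
    return [n_k / total for n_k in sizes]


def weighted_average(local_params, ratios):
    """Plaintext illustration of sum_k alpha_k * w_k, one entry per parameter.

    local_params holds one parameter list per agent (all of equal length).
    """
    length = len(local_params[0])
    return [
        sum(alpha * params[j] for alpha, params in zip(ratios, local_params))
        for j in range(length)
    ]


# Three agents reporting 100, 300, and 600 examples.
ratios = contribution_ratios([100, 300, 600])
merged = weighted_average([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], ratios)
print(ratios, merged)
```

Agents with more data therefore pull the aggregated parameters further toward their local model, which is the usual size-weighted federated averaging rule.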
2) Local Model Training by Industrial Agents: After receiving initial model parameters $w_0$ as well as $\eta, \rho_1, \rho_2, \varsigma, L, B$ from the cloud server, each industrial agent trains a deep learning-based intrusion detection model locally, using its own private data resource $D_k$ ($k \in \mathcal{K}$). The detailed training procedure is summarized in Algorithm 2. Since the local model training is performed offline, it is assumed that sufficiently strong computational capabilities can be provided, so that there is no need to care too much about the computational complexity of this algorithm.

3) Model Parameters Encryption by Industrial Agents: When a local deep learning model is trained, each industrial agent $A_k$ encrypts the model parameters $w_k^r$ using ParaEncrypt($w_{k,j}^r$, PK), where $w_k^r = (w_{k,1}^r, w_{k,2}^r, \cdots, w_{k,T}^r)$ and $j \in \mathcal{T} = \{1, 2, \cdots, T\}$. The encrypted parameters $\{E_{Pai}(w_{k,j}^r) \mid j \in \mathcal{T}\}$ of the local deep learning model are then uploaded to the cloud server by each industrial agent, where $T$ is the total number of parameters in a local deep learning model.

4) Model Parameters Aggregation by the Cloud Server: Given the contribution ratios and encrypted model parameters from all industrial agents, the cloud server aggregates them by ParaAggregate($E_{Pai}(w_{k,1}^r), \cdots, E_{Pai}(w_{k,T}^r), \alpha_1, \cdots, \alpha_K$). Then, the aggregated ciphertexts $c = \{c_j \mid j \in \mathcal{T}\}$ are sent back to the industrial agents.

5) Local Model Updating by Industrial Agents: By decrypting the ciphertexts $c$ using ParaDecrypt($c_j$, SK) ($j \in \mathcal{T}$), each industrial agent obtains the updated model parameters $\tilde{w}^r$. The parameters of the local deep learning model are then updated by $\tilde{w}^r$.

After $R$ rounds (an empirically determined threshold) of interactions between the cloud server and industrial agents, a comprehensive deep learning-based intrusion detection model can finally be obtained. As we can see from Algorithm 1, each industrial agent $A_k$ needs to conduct parameter encryption and decryption tasks, which in total require $T$ exponentiation operations in $\mathbb{Z}^*_{n^2}$ and a line of multiplication operations in $\mathbb{Z}^*_{n^2}$ (which are relatively negligible) in each communication round. In this way, the computational cost of each industrial agent $A_k$ is almost linearly proportional to the total number of parameters $T$ in a local deep learning model. As for the cloud server, it only needs to perform $K$ multiplication operations in $\mathbb{Z}^*_{n^2}$ when aggregating all industrial agents' model parameters in each communication round.

B. CNN-GRU-Based Intrusion Detection Model

In this part, we introduce the newly designed CNN-GRU-based intrusion detection model.

1) Model Architecture: The designed model is mainly composed of a CNN module and a GRU module, followed by a multilayer perceptron (MLP) module, and then a softmax layer (see Fig. 3). They are described as follows.

a) CNN Module: The CNN module mainly involves three convolutional blocks, and each convolutional block consists of a convolutional layer, a batch normalization layer, and a max-pooling layer.

b) GRU Module: The GRU module is composed of two identical GRU layers.

c) MLP Module: The MLP module involves two fully connected layers and a dropout layer (used to prevent the model from overfitting).

d) Softmax Layer: The softmax layer is exploited to map the nonnormalized output of the MLP module to a probability distribution over predicted classes.

Fig. 3. Architecture of designed CNN-GRU model.

Given a feature vector $x$ (a one-dimensional vector denoting the numerical features of a network traffic data example) as the input of the designed model, the GRU module and CNN module each process it separately. Specifically, the GRU module regards $x$ as a multivariate time series with a single time step. Prior to delivering $x$ to the GRU module, a dimension shuffle layer is applied, which transposes the temporal dimension of the feature vector. It is given by $\tilde{x} = \mathrm{Shuffle}(x)$. Then, the GRU module processes $\tilde{x}$ as follows, with the purpose of extracting the temporal patterns:

$$\tilde{h}_1 = \mathrm{GRU}_1(\tilde{x}), \quad \text{and} \quad \nu = \mathrm{GRU}_2(\tilde{h}_1) \tag{1}$$

where $\mathrm{GRU}_i$, $i \in \{1, 2\}$, represents the $i$th GRU layer, $\tilde{h}_1$ is a hidden vector, and $\nu$ is the final output of the GRU module.

The CNN module, in contrast, treats $x$ as a univariate time series with multiple time steps:

$$\begin{aligned} h_1 &= \mathrm{ConvBlock}_1(x) \\ h_2 &= \mathrm{ConvBlock}_2(h_1) \\ h_3 &= \mathrm{ConvBlock}_3(h_2) \\ \mu &= \mathrm{Flatten}(h_3) \end{aligned} \tag{2}$$

where $\mathrm{ConvBlock}_i$, $i \in \{1, 2, 3\}$, represents the $i$th convolutional block in the CNN module, and $h_1, h_2, h_3 \in \mathbb{R}^k$ are hidden vectors. The output of the three convolutional blocks is then passed to a flatten layer, the result of which is $\mu$. Following the CNN module and GRU module, $\mu$ and $\nu$ are concatenated and then fed into the MLP module, which is described by

$$\begin{aligned} c &= \mathrm{Concate}(\mu, \nu) \\ h_1 &= \mathrm{FC}_1(c) \\ h_2 &= \mathrm{FC}_2(h_1) \\ \tau &= \mathrm{Dropout}(h_2) \end{aligned} \tag{3}$$

where Concate represents the concatenation operation, $c$ is the concatenated result, $\mathrm{FC}_1$ and $\mathrm{FC}_2$ denote the two fully connected layers, and Dropout denotes the dropout layer. Moreover, $h_2$ and $\tau$ are the outputs of the two fully connected layers and the dropout layer, respectively. At last, the softmax layer provides the final classification result by $\hat{y} = \mathrm{Softmax}(\tau)$, where Softmax represents the softmax layer and $\hat{y}$ is the final classification result of the network traffic data.

Since the CNN-GRU model performs multiclassification to detect $\Gamma$ types of attacks in industrial CPSs, the cross-entropy function is used as the loss function, which is defined by

$$L = -\frac{1}{B} \sum_{i=0}^{B-1} \sum_{j=0}^{\Gamma-1} y_{i,j} \log \hat{y}_{i,j} \tag{4}$$

where $B$ denotes the batch size, $y_{i,j}$ is the true label, and $\hat{y}_{i,j}$ is the probability that the $i$th example is predicted to be the $j$th label.

2) Local Model Training: Each industrial agent $A_k$ ($k \in \mathcal{K}$) locally trains the proposed deep learning model on its own data resource $D_k$, with reference to Algorithm 2. Specifically, in the $r$th communication round, each industrial agent $A_k$ first updates model parameters $w_k^r$ based on the given updated model parameters $\tilde{w}^r$. Then, using the same data resource $D_k$, industrial agent $A_k$ retrains the deep learning model based on the
optimizer adaptive moment estimation (Adam) that is able to facilitate the convergence of the loss function.

C. Paillier-Based Secure Communication Protocol

In this part, we design a Paillier-based secure communication protocol for the developed federated learning framework. It is worth noting that the advanced encryption standard (AES) algorithm [24] is employed in our protocol to establish a secure channel between the cloud server and each industrial agent, which is helpful in mitigating malicious eavesdroppers and other external attackers. The Paillier cryptosystem [25], supporting an unlimited number of homomorphic additions, is exploited in our protocol to achieve secure and privacy-preserving federated learning over the cloud server. It is composed of the following four functions.

1) KeyGenerate($\kappa$): Given a security parameter $\kappa \in \mathbb{Z}^+$, the trust authority generates the public key $PK = (n, g)$ and the corresponding private key $SK = (\lambda, \mu)$ as per the standard Paillier cryptosystem [25], where $n$ is the product of two large prime numbers, $g \in \mathbb{Z}^*_{n^2}$ is a generator, $\mu = (L(g^{\lambda} \bmod n^2))^{-1} \bmod n$, and function $L$ is defined as $L(\alpha) = (\alpha - 1)/n$. Then, the trust authority publishes the PK and distributes $SK = (\lambda, \mu)$ to all the industrial agents. In addition, to establish a secure communication channel, the trust authority generates a symmetric key $s_i$ for the cloud server and each industrial agent $A_i$, $i \in \{1, 2, \cdots, K\}$, respectively.

2) ParaEncrypt($m$, PK): Define a function $\nu' = f(\nu) = 10^8 \cdot \nu \bmod n$, and given a message $m$, compute $m' = f(m)$. In this way, each model parameter is converted to a positive integer $m' \in \mathbb{Z}_n$. Select a random number $r \in \mathbb{Z}^*_n$ and encrypt the model parameter using the public key PK by

$$E_{Pai}(m) = g^{f(m)} \cdot r^n \bmod n^2 = g^{m'} \cdot r^n \bmod n^2. \tag{5}$$

3) ParaAggregate($E_{Pai}(m_1), \cdots, E_{Pai}(m_K), \alpha_1, \cdots, \alpha_K$): Given contribution ratios $\{\alpha_1, \alpha_2, \cdots, \alpha_K\}$ of each industrial agent, the cloud server amplifies these ratios by 1000 times to convert them into positive integers. With $K$ model parameters $\{E_{Pai}(m_1), E_{Pai}(m_2), \cdots, E_{Pai}(m_K)\}$ in hand, the cloud server then aggregates these data by

$$\begin{aligned} c &= \prod_{i=1}^{K} E_{Pai}^{\alpha_i}(m_i) \\ &= g^{\alpha_1 m'_1} r_1^{\alpha_1 n} \cdot g^{\alpha_2 m'_2} r_2^{\alpha_2 n} \cdots g^{\alpha_K m'_K} r_K^{\alpha_K n} \bmod n^2 \\ &= g^{\sum_{i=1}^{K} \alpha_i m'_i} \cdot \prod_{i=1}^{K} r_i^{\alpha_i n} \bmod n^2. \end{aligned} \tag{6}$$

4) ParaDecrypt($c$, SK): When receiving the ciphertext $c$ of a summed updated model parameter from the cloud server, each industrial agent decrypts the summed updated model parameter $\tilde{m}'_{sum}$ by

$$\begin{aligned} \tilde{m}'_{sum} &= L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n \\ &= \frac{L\big(\big(g^{\sum_{i=1}^{K} \alpha_i m'_i} \cdot \prod_{i=1}^{K} r_i^{\alpha_i n}\big)^{\lambda} \bmod n^2\big)}{L(g^{\lambda} \bmod n^2)} \bmod n \\ &= \sum_{i=1}^{K} \alpha_i m'_i \bmod n. \end{aligned} \tag{7}$$

Then, compute the average value of the summed updated model parameter by $\tilde{m}' = \tilde{m}'_{sum}/1000$. Recall that 1000 denotes a scalar used to convert the contribution ratios to positive integers. Define a function $\nu = f^{-1}(\nu') = 10^{-8} \cdot \nu' \bmod n$. Considering that the original model parameters can either be positive (less than $n/2$ after the conversion by $\nu' = f(\nu)$) or negative (larger than $n/2$ after the conversion), we recover the updated model parameter to the original scale by

$$\tilde{m} = \begin{cases} f^{-1}(\tilde{m}'), & \text{if } \tilde{m}' < n/2 \\ f^{-1}(\tilde{m}' - n), & \text{otherwise.} \end{cases} \tag{8}$$

V. PERFORMANCE EVALUATION

In this section, we conduct extensive experiments to evaluate the performance of our proposed DeepFed scheme. First, we give the experiment settings, including the environmental setup, data resource description and partitioning, baseline studies, and performance metrics. Then, we carry out a series of experiments to compare the performance of our proposed intrusion detection model with some state-of-the-art studies, including the Schneble's [26], Nguyen's [21], and Chen's [27], under our developed federated learning framework. In addition, we also compare the performance of the developed intrusion detection model with those local intrusion detection models built by each industrial agent as well as the ideal intrusion detection model built by a central entity on all data resources.
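Before turning to the experiments, the four functions of the Paillier-based protocol in Section IV-C can be made concrete with a small sketch. The Python code below is a toy illustration only, written under several assumptions that are not in the paper: the function names are hypothetical, the primes are two small Mersenne primes (far too small for real use), and $g = n + 1$ is a commonly used generator choice picked here for simplicity. One deliberate difference from the text is that the sketch undoes the modular wrap-around of negative values before dividing by the ratio scale of 1000, an ordering chosen here so that negative parameters survive the division in this toy setting.

```python
import math
import secrets

SCALE = 10**8        # fixed-point factor from f(v) = 10^8 * v mod n
RATIO_SCALE = 1000   # factor that turns contribution ratios into integers
P = 2**31 - 1        # toy Mersenne primes (assumption, not secure sizes)
Q = 2**61 - 1


def L_func(a, n):
    """L(a) = (a - 1) / n, as used in the Paillier cryptosystem."""
    return (a - 1) // n


def key_generate(p, q):
    """Return the key pairs PK = (n, g) and SK = (lam, mu)."""
    n = p * q
    g = n + 1  # assumed generator choice
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(L_func(pow(g, lam, n * n), n), -1, n)
    return (n, g), (lam, mu)


def para_encrypt(m, pk):
    """Encrypt one real-valued parameter m as g^{f(m)} * r^n mod n^2."""
    n, g = pk
    m_int = round(m * SCALE) % n  # negative values wrap to above n/2
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m_int, n * n) * pow(r, n, n * n)) % (n * n)


def para_aggregate(ciphertexts, ratios, pk):
    """Multiply ciphertexts raised to their integer ratios (weighted sum)."""
    n, _ = pk
    c = 1
    for ct, alpha in zip(ciphertexts, ratios):
        c = (c * pow(ct, round(alpha * RATIO_SCALE), n * n)) % (n * n)
    return c


def para_decrypt(c, pk, sk):
    """Decrypt the aggregate and map it back to a real-valued parameter."""
    n, _ = pk
    lam, mu = sk
    m_sum = (L_func(pow(c, lam, n * n), n) * mu) % n
    if m_sum > n // 2:   # a wrapped negative value
        m_sum -= n
    return m_sum / (RATIO_SCALE * SCALE)


pk, sk = key_generate(P, Q)
cts = [para_encrypt(v, pk) for v in (0.25, -0.75, 0.125)]
print(para_decrypt(para_aggregate(cts, [0.5, 0.3, 0.2], pk), pk, sk))
```

The aggregator in this sketch only ever handles ciphertexts and public ratios, which mirrors the role assigned to the semihonest cloud server; the decryption key stays with the agents.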

TABLE I
NUMERICAL RESULTS OF INTRUSION DETECTION MODELS WITH VARYING COMMUNICATION ROUNDS UNDER THREE DIFFERENT SCENARIOS

A. Experiment Settings

1) Environmental Setup: The designed CNN-GRU model is implemented using the Keras API¹, and the federated learning framework is built with Flask², a lightweight Python web framework. Our experiments are conducted on an Ubuntu 18.04.3 LTS platform with an Intel Xeon E5-2618L v3 CPU and an NVIDIA GeForce RTX 2080 Ti GPU (64 GB RAM).

¹ Keras: Python deep learning library (http://keras.io/).
² Flask: Python web development framework (http://flask.pocoo.org/).

2) Data Resource Description and Partitioning: We conduct experiments on a real data resource of a gas pipeline system (one significant example of industrial CPSs) [28]. In this data resource, one class of network data under normal operations and seven classes under various cyberattacks are collected. Each piece of network data in this data resource contains 26 features and 1 label. In our experiments, the data resource is divided into two major parts, i.e., 80% for training and 20% for testing, and the training part is further divided evenly among the industrial agents for local model training. Note that all the trained deep learning models are tested on the same testing data.

3) Baseline Studies: In this work, we compare the performance of our proposed DeepFed scheme with some state-of-the-art studies in which federated learning frameworks are also used. Schneble et al. [26] proposed a single-layer MLP-based federated learning framework for attack detection in medical CPSs. Nguyen et al. [21] presented a three-hidden-layer GRU-based federated self-learning system for intrusion detection in IoT networks. Further, Chen et al. [27] utilized a CNN-based federated framework for data classification, which is composed of two convolutional layers, two max-pooling layers, two fully connected layers, and one softmax layer. We fully reproduce these deep learning models in our work and compare their performance with that of our designed model under the proposed federated learning framework.

4) Performance Metrics: Four common metrics are used to evaluate the performance of the detection model as follows.
a) Accuracy: The proportion of examples that the model classifies correctly.
b) Precision: The proportion of examples identified as cyberattacks that are indeed cyberattacks.
c) Recall: The proportion of all cyberattack examples that are correctly identified as their exact types of cyberattacks.
d) F-score: The weighted average of precision and recall.
Note that the macro-averaged values are utilized to comprehensively evaluate the performance of all considered intrusion detection models.

B. Performance Comparison With State-of-the-Art Studies

We first conduct experiments to compare the performance of our proposed DeepFed scheme with the abovementioned baseline studies [21], [26], [27]. Three groups of experiments are conducted, where different numbers of industrial agents K = 3, 5, and 7 are, respectively, considered.

Table I shows the numerical results for the federated intrusion detection models, in terms of accuracy, precision, recall, and F-score, under three different scenarios with R = 2, 4, 6, 8, and 10, respectively. It can be seen that the proposed intrusion detection model outperforms the other state-of-the-art studies on all metrics. As the number of communication rounds R increases from 1 to 10, the performance of each intrusion detection model generally improves, and gradually stabilizes when R is sufficiently large. It is worth noting that we obtain an accuracy, precision, recall, and F-score of 99.20%, 98.86%, 97.34%, and 98.08%, respectively, when K = 3; 99.20%, 98.85%, 97.45%, and 98.13% when K = 5; and 99.20%, 98.85%, 97.47%, and 98.14% when K = 7,


Fig. 4. Comparison of the accuracy and F-score of considered intrusion detection models with varying communication rounds under three different scenarios. (a) Accuracy versus R (K = 3). (b) Accuracy versus R (K = 5). (c) Accuracy versus R (K = 7). (d) F-score versus R (K = 3). (e) F-score versus R (K = 5). (f) F-score versus R (K = 7).

Fig. 5. Performance comparison of the local, ideal, and the proposed intrusion detection models under three different scenarios. (a) K = 3, R =
10. (b) K = 5, R = 10. (c) K = 7, R = 10.
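The local models compared in Fig. 5 are each trained on a single agent's share of the training data. A minimal sketch of the 80%/20% split and the even partitioning described in the experiment settings might look as follows. The shuffling step, the fixed seed, the round-robin assignment of records to agents, and the helper name `split_and_partition` are assumptions made for illustration and are not details given in the paper.

```python
import random


def split_and_partition(records, num_agents, train_fraction=0.8, seed=0):
    """Split records into train/test, then share the training part
    as evenly as possible among num_agents industrial agents."""
    rng = random.Random(seed)  # assumption: shuffle before splitting
    shuffled = list(records)
    rng.shuffle(shuffled)

    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]

    # Round-robin assignment: shard sizes differ by at most one record.
    shards = [train[i::num_agents] for i in range(num_agents)]
    return shards, test


# Toy usage with 100 dummy records and the three agent counts K = 3, 5, 7.
for k in (3, 5, 7):
    shards, test = split_and_partition(range(100), num_agents=k)
    print(k, [len(s) for s in shards], len(test))
```

Every agent's model and the ideal centralized model would then be evaluated on the same held-out `test` list, matching the note that all trained models are tested on the same testing data.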

with the communication round R = 10. Fig. 4 also visually presents the accuracy and F-score of all considered intrusion detection models with varying communication rounds, under K = 3, 5, and 7, respectively. It is clear that all intrusion detection models tend to converge after sufficient rounds of communication with the cloud server. Importantly, the proposed intrusion detection model generally achieves the best performance among all baselines.

C. Performance Comparison With Local and Ideal Models

In addition to the above experiments, we also carry out experiments to evaluate the performance of each locally built intrusion detection model using limited data resources as well as the performance of an ideal model built by a central entity using all the data resources. Fig. 5 shows the numerical results of all four metrics for the abovementioned local, ideal, and proposed intrusion detection models, respectively, with varying settings of K. As we can see, all the local intrusion detection models perform unsatisfactorily compared with the proposed model. Importantly, we also observe that the proposed model produces sufficiently good performance compared with the ideal model. It is, therefore, worth noting that the proposed model would be a wise choice for all industrial CPS owners due to its high performance in intrusion detection and its ability to preserve the privacy of their data resources.

Furthermore, we also evaluate the performance of the local, ideal, and our proposed models in detecting various types of cyber threats against industrial CPSs. The numerical results are

TABLE II
NUMERICAL RESULTS OF THE LOCAL, IDEAL, AND PROPOSED MODELS IN DETECTING VARIOUS TYPES OF CYBER THREATS (K = 5)

summarized in Table II (taking K = 5 as an example). As can be seen, the proposed intrusion detection model exhibits excellent performance in terms of precision, recall, and F-score when detecting multiple types of cyber threats against industrial CPSs compared to a local model, and almost the same performance as an ideal model.

VI. CONCLUSION

In this article, we proposed a federated deep learning scheme, named DeepFed, for detecting and mitigating cyber threats against industrial CPSs. First, we developed a novel federated learning framework for multiple industrial CPSs, enabling the collective building of a comprehensive intrusion detection model in a privacy-preserving way. In addition, we created a novel CNN-GRU-based intrusion detection model, which allows effective detection of various types of cyber threats against industrial CPSs. Further, a Paillier-based secure communication protocol was designed for the federated learning framework, which effectively preserves the security and privacy of model parameters in the training process. Extensive experiments on a real industrial CPS dataset demonstrated the high effectiveness of the proposed DeepFed scheme as well as its superiority over state-of-the-art schemes.

It is worth noting that the proposed scheme builds a federated intrusion detection model mainly for same-domain industrial CPSs. Future research directions will focus on addressing cybersecurity issues by federating data resources from different-domain industrial CPSs.

REFERENCES

[1] C. Lu et al., “Real-time wireless sensor-actuator networks for industrial cyber-physical systems,” Proc. IEEE, vol. 104, no. 5, pp. 1013–1024, May 2016.
[2] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Blockchain and federated learning for privacy-preserved data sharing in industrial IoT,” IEEE Trans. Ind. Informat., vol. 16, no. 6, pp. 4177–4186, Jun. 2020.
[3] B. Li, R. Lu, W. Wang, and K.-K. R. Choo, “Distributed host-based collaborative detection for false data injection attacks in smart grid cyber-physical system,” J. Parallel Distrib. Comput., vol. 103, pp. 32–41, May 2017.
[4] C. Chen, J. Yan, N. Lu, Y. Wang, X. Yang, and X. Guan, “Ubiquitous monitoring for industrial cyber-physical systems over relay-assisted wireless sensor networks,” IEEE Trans. Emerg. Topics Comput., vol. 3, no. 3, pp. 352–362, Sep. 2015.
[5] H. Bao, R. Lu, B. Li, and R. Deng, “BLITHE: Behavior rule based insider threat detection for smart grid,” IEEE Internet Things J., vol. 3, no. 2, pp. 190–205, Apr. 2016.
[6] M. Hao, H. Li, X. Luo, G. Xu, H. Yang, and S. Liu, “Efficient and privacy-enhanced federated learning for industrial artificial intelligence,” IEEE Trans. Ind. Informat., vol. 16, no. 10, pp. 6532–6542, Oct. 2019.
[7] B. Li, R. Lu, W. Wang, and K. R. Choo, “DDOA: A Dirichlet-based detection scheme for opportunistic attacks in smart grid cyber-physical system,” IEEE Trans. Inf. Forensics Secur., vol. 11, no. 11, pp. 2415–2425, Nov. 2016.
[8] K. Zetter, “Inside the cunning, unprecedented hack of Ukraine’s power grid,” Wired, Mar. 2016. [Online]. Available: https://www.wired.com/2016/03/inside-cunning-unprecedented-hack-ukraines-power-grid/
[9] N. Falliere, L. O. Murchu, and E. Chien, “W32.Stuxnet dossier,” Symantec Corp., Tempe, AZ, USA, White paper, vol. 5, Feb. 2011.
[10] Z. Bederna and T. Szadeczky, “Cyber espionage through botnets,” Secur. J., vol. 33, no. 1, pp. 43–62, Mar. 2020.
[11] N. Sayfayn and S. Madnick, “Cybersafety analysis of the Maroochy Shire sewage spill,” MIT Interdisciplinary Consortium for Improving Critical Infrastructure Cybersecurity, MIT Management Sloan School, Cambridge, MA, USA, Working Paper CISL 2017-09, vol. 9, May 2017.
[12] J. Felker and M. Edwards, “ICS-CERT Annual Assessment Report,” Industrial Control Systems Cyber Emergency Response Team, 2017, vol. S508C. [Online]. Available: https://www.us-cert.gov/sites/default/files/Annual-Reports/FY2016-Industrial-Control-Systems-Assessment-Summary-Report-S508C.pdf
[13] K. Stouffer, V. Pillitteri, S. Lightman, M. Abrams, and A. Hahn, “Guide to Industrial Control Systems (ICS) Security,” U.S. Department of Commerce, Washington, D.C., USA, NIST-800-82 (R2), May 2015. [Online]. Available: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-82r2.pdf
[14] C. Qiu, F. R. Yu, H. Yao, C. Jiang, F. Xu, and C. Zhao, “Blockchain-based software-defined industrial Internet of Things: A dueling deep Q-learning approach,” IEEE Internet Things J., vol. 6, no. 3, pp. 4627–4639, Jun. 2019.
[15] M. Ismail, M. F. Shaaban, M. Naidu, and E. Serpedin, “Deep learning detection of electricity theft cyber-attacks in renewable distributed generation,” IEEE Trans. Smart Grid, vol. 11, no. 4, pp. 3428–3437, Jul. 2020.
[16] J. Yang, C. Zhou, S. Yang, H. Xu, and B. Hu, “Anomaly detection based on zone partition for security protection of industrial cyber-physical systems,” IEEE Trans. Ind. Electron., vol. 65, no. 5, pp. 4257–4267, May 2018.
[17] H. Wang, J. Ruan, G. Wang, B. Zhou, Y. Liu, X. Fu, and J. Peng, “Deep learning-based interval state estimation of AC smart grids against sparse cyber attacks,” IEEE Trans. Ind. Informat., vol. 14, no. 11, pp. 4766–4778, Nov. 2018.
[18] H. Yang, L. Cheng, and M. C. Chuah, “Deep-learning-based network intrusion detection for SCADA systems,” in Proc. IEEE Conf. Commun. Netw. Secur., Washington, DC, USA, Jun. 10–12, 2019, pp. 337–343.
[19] J. Liu, W. Zhang, T. Ma, Z. Tang, Y. Xie, W. Gui, and J. P. Niyoyita, “Toward security monitoring of industrial cyber-physical systems via hierarchically distributed intrusion detection,” Expert Syst. Appl., vol. 158, pp. 113 578–113 400, Nov. 2020.
[20] D. Preuveneers, V. Rimmer, I. Tsingenopoulos, J. Spooren, W. Joosen, and E. Ilie-Zudor, “Chained anomaly detection models for federated learning: An intrusion detection case study,” Appl. Sci., vol. 8, no. 12, pp. 2663–2683, Dec. 2018.


[21] T. D. Nguyen, S. Marchal, M. Miettinen, H. Fereidooni, N. Asokan, and A.-R. Sadeghi, “DÏoT: A federated self-learning anomaly detection system for IoT,” in Proc. IEEE Int. Conf. Distrib. Comput. Syst., Dallas, TX, USA, Jul. 7–10, 2019, pp. 756–767.
[22] Y. Zhao, J. Chen, D. Wu, J. Teng, and S. Yu, “Multi-task network anomaly detection using federated learning,” in Proc. 10th Int. Symp. Inf. Commun. Technol., Hanoi HaLong Bay, Vietnam, Dec. 4–6, 2019, pp. 273–279.
[23] Y. Chen, J. Zhang, and C. K. Yeo, “Network anomaly detection using federated deep autoencoding Gaussian mixture model,” in Proc. Int. Conf. Mach. Learn. Netw., Paris, France, Dec. 3–5, 2019, pp. 1–14.
[24] M. J. Dworkin et al., Advanced Encryption Standard (AES), Nov. 2001, vol. 197. [Online]. Available: https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.197.pdf
[25] P. Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” in Proc. Int. Conf. Theory Appl. Cryptographic Techn., Cologne, Germany, Apr. 26–30, 1999, pp. 223–238.
[26] W. Schneble and G. Thamilarasu, “Attack detection using federated learning in medical cyber-physical systems,” in Proc. Int. Conf. Comput. Commun. Netw., Valencia, Spain, Jul. 29–Aug. 1, 2019.
[27] Y. Chen, X. Qin, J. Wang, C. Yu, and W. Gao, “FedHealth: A federated transfer learning framework for wearable healthcare,” IEEE Intell. Syst., vol. 35, no. 4, pp. 83–93, Jul.–Aug. 2020.
[28] T. Morris and W. Gao, “Industrial control system traffic data sets for intrusion detection research,” in Proc. Int. Conf. Critical Infrastruct. Protection, Arlington, TX, USA, Mar. 17–19, 2014, pp. 65–78.

Beibei Li (Member, IEEE) received the B.E. (Hons.) degree in communication engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2014, and the Ph.D. degree in cybersecurity from the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, in 2019.
He is currently an Associate Professor with the College of Cybersecurity, Sichuan University, Chengdu, China. He was invited as a Visiting Researcher with the Faculty of Computer Science, University of New Brunswick, Fredericton, Canada, from March to August 2018, and also with the research group of Networked Sensing and Control, College of Control Science and Engineering, Zhejiang University, Hangzhou, China, from February to April 2019. He has authored or coauthored works in IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, ACM Transactions on Cyber-Physical Systems, IEEE INTERNET OF THINGS JOURNAL, IFAC Automatica, Information Sciences, IEEE ICC, and IEEE GLOBECOM, etc. His current research interests include several areas in security and privacy issues on cyber-physical systems (e.g., smart grids, industrial control systems, etc.), with a focus on intrusion detection techniques, artificial intelligence, and applied cryptography.
Dr. Li is serving or has served as a Publicity Chair, Publication Co-Chair, or a TPC member for several international conferences, including IEEE International Conference on Communications (ICC), IEEE Global Communications Conference (GLOBECOM), IEEE International Conference on Computing, Networking and Communications (ICNC), IEEE International Conference on Advanced Technologies for Communications (ATC), and International Conference on Wireless Communications and Signal Processing (WCSP).

Yuhao Wu is currently working toward the B.E. degree in cybersecurity with the College of Cybersecurity, Sichuan University, Chengdu, China.
He has authored or coauthored several papers in IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, Knowledge-Based Systems, and International Conference on Web Information Systems Engineering, etc. His research interests include cyber–physical system security, online social network security, and artificial intelligence.

Jiarui Song is currently working toward the B.E. degree in cybersecurity with the College of Cybersecurity, Sichuan University, Chengdu, China.
She has authored or coauthored works in IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS and IEEE Conference on Local Computer Networks. Her current research interests include intrusion detection, artificial intelligence, and social network analysis.

Rongxing Lu (Senior Member, IEEE) received the Ph.D. degree in cryptography from the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada, in 2012.
He is an Associate Professor with the Faculty of Computer Science (FCS), University of New Brunswick (UNB), Fredericton, NB, Canada. Before that, he worked as an Assistant Professor with the School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore, from April 2013 to August 2016. He worked as a Postdoctoral Fellow with the University of Waterloo, from May 2012 to April 2013.
Dr. Lu is a Senior Member of IEEE Communications Society. He was the recipient of the most prestigious Governor General's Gold Medal, in 2012, the 8th IEEE Communications Society (ComSoc) Asia Pacific (AP) Outstanding Young Researcher Award, in 2013, and the 2016–17 Excellence in Teaching Award, FCS, UNB. He currently serves as the Vice-Chair (Conferences) of IEEE ComSoc CIS-TC.

Tao Li received the Ph.D. degree in computer science from the University of Electronic Science and Technology of China, Chengdu, China, in 1994.
He is currently a Professor with the College of Cybersecurity, Sichuan University, Chengdu, China. He has authored or coauthored nearly 300 papers in IEEE, ACM, Chinese Science, Science Bulletin, Natural Science Progress, and other important journals and academic conferences. His main research interests include immune computing, artificial immune systems, cloud computing, and cloud storage.

Liang Zhao (Member, IEEE) received the M.S. degree in computer science from Chongqing University, Chongqing, China, in 2009, and the Ph.D. degree in informatics from Kyushu University, Fukuoka, Japan, in 2012.
He is currently an Assistant Professor with the College of Cybersecurity, Sichuan University, Chengdu, China. He was a Visiting Researcher with the Surrey Center for Cyber Security of United Kingdom, from 2017 to 2018. His current research focuses on cryptography, in particular, provable security, verifiable (outsourced) computation, and postquantum cryptography.
