Science of Cyber Security, 1st Edition. Edited by Wenlian Lu, Kun Sun, Moti Yung, and Feng Liu. ISBN 978-3-030-89137-4 (ISBN-10: 3030891372).
Founding Editors
Gerhard Goos
Karlsruhe Institute of Technology, Karlsruhe, Germany
Juris Hartmanis
Cornell University, Ithaca, NY, USA
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The third annual International Conference on Science of Cyber Security (SciSec 2021)
was held successfully online during August 13–15, 2021. The mission of SciSec is
to catalyze the research collaborations between the relevant scientific communities
and disciplines that should work together in exploring the foundational aspects of
cybersecurity. We believe that this collaboration is needed in order to deepen our under-
standing of, and build a firm foundation for, the emerging science of cybersecurity
discipline. SciSec is unique in appreciating the importance of multidisciplinary and
interdisciplinary broad research efforts towards the ultimate goal of a sound science of
cybersecurity, which attempts to deeply understand and systematize knowledge in the
field of security.
SciSec 2021 solicited high-quality, original research papers that could justifiably help
develop the science of cybersecurity. Topics of interest included, but were not limited
to, the following:
– Cybersecurity Dynamics
– Cybersecurity Metrics and Their Measurements
– First-principle Cybersecurity Modeling and Analysis (e.g., Dynamical Systems,
Control-Theoretic Modeling, Game-Theoretic Modeling)
– Cybersecurity Data Analytics
– Quantitative Risk Management for Cybersecurity
– Big Data for Cybersecurity
– Artificial Intelligence for Cybersecurity
– Machine Learning for Cybersecurity
– Economics Approaches for Cybersecurity
– Social Sciences Approaches for Cybersecurity
– Statistical Physics Approaches for Cybersecurity
– Complexity Sciences Approaches for Cybersecurity
– Experimental Cybersecurity
– Macroscopic Cybersecurity
– Statistics Approaches for Cybersecurity
– Human Factors for Cybersecurity
– Compositional Security
– Biology-inspired Approaches for Cybersecurity
– Synergistic Approaches for Cybersecurity
SciSec 2021 was hosted by Fudan University, Shanghai, China. Due to the
intensification of the COVID-19 situation all around the world, SciSec 2021 was held
entirely online through Tencent Conference and VooV Meeting. The Program Committee
selected 22 papers — 17 full papers and 5 poster papers — from a total of 50 submissions
for presentation at the conference. These papers cover the following subjects: detec-
tion for cybersecurity, machine learning for cybersecurity, and dynamics, network and
inference. We anticipate that the topics covered by the program in the future will be
more systematic and further diversified.
The Program Committee further selected the paper titled “Detecting Internet-
scale Surveillance Devices using RTSP Recessive Features” by Zhaoteng Yan, Zhi Li,
Wenping Bai, Nan Yu, Hongsong Zhu, and Limin Sun and the paper titled “Dismantling
Interdependent Networks Based on Supra-Laplacian Energy” by Wei Lin, Shuming
Zhou, Min Li, and Gaolin Chen for the Distinguished Paper Award. The conference
program also included four invited keynote talks: the first keynote titled “Layers of
Abstractions and Layers of Obstructions and the U2F” was delivered by Moti Yung,
Google and Columbia University, USA; the second keynote titled “Progresses and
Challenges in Federated Learning” was delivered by Gong Zhang, Huawei, China; the
third keynote titled “SARR: A Cybersecurity Metrics and Quantification Framework”
was delivered by Shouhuai Xu, University of Colorado Colorado Springs, USA; while
the fourth keynote was titled “Preliminary Exploration on Several Security Issues in AI”
and was delivered by Yugang Jiang, Fudan University, China. The conference program
presented a panel discussion on “Where are Cybersecurity Boundaries?”
We would like to thank all of the authors of the submitted papers for their interest
in SciSec 2021. We also would like to thank the reviewers, keynote speakers, and
participants for their contributions to the success of SciSec 2021. Our sincere gratitude
further goes to the Program Committee, the Publicity Committee, and the Organizing
Committee, for their hard work and great efforts throughout the entire process of
preparing and managing the event. Furthermore, we are grateful to Fudan University
for their generosity in enabling free registration for attending SciSec 2021.
We hope that you will find the conference proceedings inspiring and that they will
further help you in finding opportunities for your future research.
Steering Committee
Guoping Jiang Nanjing University of Posts and Telecommunications, China
Feng Liu Institute of Information Engineering, Chinese Academy of
Sciences, China
Shouhuai Xu University of Colorado Colorado Springs, USA
Moti Yung Google and Columbia University, USA
Publicity Co-chairs
Habtamu Abie Norwegian Computing Center, Norway
Guen Chen University of Texas at San Antonio, USA
Noseong Park George Mason University, USA
Chunhua Su University of Aizu, Japan
Jia Xu Nanjing University of Posts and Telecommunications, China
Xiaofan Yang Chongqing University, China
Jeong Hyun Yi Soongsil University, South Korea
Lidong Zhai Institute of Information Engineering, Chinese Academy of
Sciences, China
James Zheng Macquarie University, Australia
Web Chair
Weixia Cai Institute of Information Engineering, Chinese Academy of
Sciences, China
Keynote Report
SARR: A Cybersecurity Metrics and Quantification Framework
Shouhuai Xu
1 Introduction
Effective cybersecurity design, operations, and management ought to rely on
quantitative metrics. This is because effective cybersecurity decision-making and
management demands cybersecurity quantification, which in turn requires us
to tackle the problem of metrics. For example, when a Chief Executive Officer
(CEO) decides whether to increase the enterprise’s cybersecurity investment, the
CEO would ask a simple question: What is the estimated return, ideally mea-
sured in dollar amount, if we increase the cybersecurity budget (say) by $5M
this year? Unfortunately, the status quo is that we cannot answer this question
yet because cybersecurity metrics and quantification remains one of the most
difficult yet fundamental open problems [10,32,38], despite significant efforts
[3,4,6–8,21,30,33,35,37,39,40,59].
Our Contributions. In this paper, we propose a systematic approach to tackling the problem by unifying Security, Agility, Resilience, and Risks (SARR) metrics into a single framework. The approach is assumption-driven and embraces the uncertainty about whether these assumptions are violated in the real world.
2.1 Terminology
Abstractions and Views. Cyberspace is a complex system, which mandates
the use of multiple (levels of) abstractions to understand it. We use the
term network broadly to include the entire cyberspace, an infrastructure, an
enterprise network, or a cyber-physical-human network of interest. Networks
can be decomposed horizontally or vertically, leading to two views:
[Figure: Assumptions (threat model, trust, etc.)]
1. It is certain that the assumptions are not violated. This often corresponds
to the analyses that are conducted at the design phase, where designers con-
sider a range of security properties (e.g., confidentiality, integrity, availability,
authentication, and non-repudiation) with respect to a certain system model
and a certain threat model. Essentially, these security properties are often
defined over a binary scale, denoted by {0, 1}, indicating whether a property
holds or not under the system model and the threat model.
2. It is certain that some or all assumptions are violated. This often corresponds
to the operation phase, where security properties may be partially or entirely
compromised. Therefore, security properties may be defined over a continu-
ous scale, such as [0, 1] (e.g., the fraction of compromised computers in a
network). In this case, detection of violations would trigger the defender to
take countermeasures to “bounce back” from the violations, leading to the
notion of agility and resilience metrics, which will be elaborated later.
3. It is uncertain whether assumptions are violated or not (i.e., assumptions may
be violated). This naturally leads to risk metrics by associating uncertainties
to security, agility and resilience metrics.
In the rest of this section, we elaborate on these matters.
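To make the three regimes concrete, here is a small illustrative sketch (ours, not part of the SARR framework itself) mapping each assumption status to the metric scale it admits and the additional metric families it calls for.

```python
from dataclasses import dataclass
from enum import Enum


class AssumptionStatus(Enum):
    HOLDS = "certainly not violated"          # typical design-phase analysis
    VIOLATED = "certainly (partly) violated"  # typical operation-phase situation
    UNCERTAIN = "may be violated"             # the common real-world case


@dataclass
class MetricRegime:
    scale: str                      # "{0,1}" or "[0,1]"
    needs_agility_resilience: bool
    needs_risk: bool


def regime_for(status: AssumptionStatus) -> MetricRegime:
    if status is AssumptionStatus.HOLDS:
        return MetricRegime("{0,1}", False, False)   # binary security properties
    if status is AssumptionStatus.VIOLATED:
        return MetricRegime("[0,1]", True, False)    # degradation plus bounce-back
    return MetricRegime("[0,1]", True, True)         # uncertainty leads to risk metrics
```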
2.3 Assumptions
In order to tame cybersecurity, assumptions may be made, explicitly or implic-
itly, during the design and operation phases of a network, device, component
or building-block. They are fundamental to cybersecurity properties.
Assumptions Associated with the Design Phase. At this phase, assump-
tions can be made with respect to system models, vulnerabilities, attacks (i.e.,
threat models) and defenses. For example, designers often use system models to
describe the interactions between the participating entities, the environment and
the interaction with it (if appropriate), the communication channels between the
participating entities (e.g., authenticated private channel), and the trust that is
embedded into the model (e.g., a participating entity is semi-honest or honest).
Designers use threat models with simplifying assumptions when specifying security properties, proposing system architectures, selecting protocols and mechanisms, and analyzing whether a property is attained under those assumptions.
Programmers and testers detect/eliminate bugs and vulnerabilities in the course
of developing software, while making various (possibly implicit) assumptions
(e.g., competency of a bug/vulnerability detection tool).
Assumptions Associated with the Operation Phase. During this phase,
various kinds of (possibly implicit) assumptions are often made (e.g., compe-
tency of configurations or defense tools). One example of assumptions that are
often made at the design phase and then inherited at the operation phase is the
attacker’s capability. For example, Byzantine Fault-Tolerance (BFT) protocols,
which can be seen as a building-block, work correctly when no more than one-
third of the replicas are compromised [29]. However, there is no guarantee in the
real world that the attacker cannot go beyond the one-third threshold, effectively
compromising the assurance offered by these powerful building-blocks. This can
be further attributed to the limited capabilities of cyber defense tools, such as
intrusion detection systems and malware detectors.
Under the premise that assumptions are complete and are not violated, cyber-
security metrics may degenerate to security metrics in the sense that agility,
resilience and risk may become irrelevant. Moreover, it may be sufficient to use
binary metrics, namely {0, 1}, to quantify security properties. This serves as a
starting point towards tackling cybersecurity metrics because it would be rare
to ascertain in the real world that assumptions are certainly not violated and
that the articulated assumptions are sufficient.
Metrics Associated with the Design Phase. At the design phase, we need
to define metrics to precisely describe the desired security properties. Textbook
knowledge would teach us that the desired properties include confidentiality,
integrity, availability, authentication, non-repudiation, etc. However, they may
not be sufficient. We advocate accurate and rigorous definitions (or specifica-
tions) of metrics, ideally as accurate and rigorous as the definitions given in
modern cryptography [16]. This is important because when accurate and rigor-
ous definitions are not given, it is not possible to conduct rigorous analysis to
establish desired properties. This means that each security property must be pre-
cisely defined with respect to a system model and a threat model. For example,
when we specify an availability property, we should specify it as a property of a
service (e.g., the service offered at port #80) vs. data (e.g., a file in a computer)
in the presence of some attack.
Metrics Associated with the Operation Phase. We need to define met-
rics to precisely describe the required security properties of a network, device,
component, or building-block at the operation phase. For example, availability
metrics at the operation phase may include service response time and service
throughput. Metrics associated with the operation phase are less understood
than their counterparts associated with the design phase.
When assumptions are violated, some or all of the security properties are com-
promised. In order to describe how defenders respond to such violations of
assumptions or compromises of security properties, agility and resilience proper-
ties emerge. Intuitively, agility quantitatively characterizes how fast a defender
responds to cybersecurity situation changes [8,30], and resilience quantitatively
characterizes whether and how the defender can make the network, device, com-
ponent or building-block “bounce back” from the violation of assumptions (i.e.,
correcting the violations) and the compromise of security properties (i.e., making
them hold again). The state-of-the-art is that the notions of agility-by-design and
resilience-by-design are less investigated and understood than security-by-design.
Agility and resilience are inherently associated with the operation phase because
(i) assumptions are the starting point of a design process and (ii) assumptions are
violated in real-world operations but not at the design phase. When assumptions
are violated, we propose quantifying security, agility, and resilience properties.
For quantifying security properties, examples of metrics are described as fol-
lows. (i) To what extent may an assumption have been violated? This may
require quantifying the extent to which a network, device, component, or
building-block is compromised. This is important for example when using BFT
protocols to tolerate attacks, where the fraction of devices that are compro-
mised (e.g., 35% vs. 50%) would make a difference in the defender’s response
to the attacks. (ii) To what extent is a security property compromised? This is
important because a security property may not be all-or-nothing, meaning that
a violation of assumptions may only cause a degradation of a security property.
For example, when a network (or device) is compromised, the attacker may only
be able to steal some, but not all, of the data stored on the network (or device),
causing a partial loss of the confidentiality property.
For quantifying agility, example metrics are described as follows. (i) How agile
is the defender in detecting the violation of an assumption? One assumption can
be that an employed intrusion prevention system can effectively detect a certain
class of attacks. Another assumption can be that the attacker does not identify
any 0-day vulnerability or use any new attack vector that cannot be recognized
by defense tools. (ii) How fast do the desired security properties degrade because
of the violation of assumptions? (iii) How quickly does the defender react to
the violation of assumptions or successful attacks? (iv) How quickly does the
defender bring the network to the required level of security properties?
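For concreteness, the timing questions (i), (iii), and (iv) above can be operationalized as elapsed-time metrics over an incident timeline; the sketch below is ours and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class IncidentTimeline:
    t_violation: float   # when the assumption was actually violated
    t_detected: float    # when the defender detected the violation
    t_reacted: float     # when countermeasures were launched
    t_restored: float    # when the required security properties held again


def agility_metrics(tl: IncidentTimeline) -> dict:
    return {
        "time_to_detect": tl.t_detected - tl.t_violation,   # question (i)
        "time_to_react": tl.t_reacted - tl.t_detected,      # question (iii)
        "time_to_restore": tl.t_restored - tl.t_reacted,    # question (iv)
        "total_exposure": tl.t_restored - tl.t_violation,
    }
```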
For quantifying resilience, example metrics are described as follows. (i) What
is the maximum degree of violation in terms of the assumptions or security
properties that would make it possible for the defender to recover the network (or
device or component) and its services without shutting down and re-booting it
from scratch? In order to quantify these, we would need to quantify the maximum
degree of violation with respect to the assumptions that can be tolerated. (ii)
Does a security property degrade gradually or abruptly when assumptions are
violated? (iii) How does the degradation pattern, such as gradual vs. abrupt,
depend on the degree of violations of the assumptions?
3 Status Quo
In this section, we use the SARR framework as a lens to look into the cyberse-
curity metrics that have been proposed in the literature. For this purpose, we
leverage survey papers [8,35,37] as a source of metrics, while considering more
recent literature published after those survey papers (e.g., [13,30]).
3.1 Assumptions
Assumptions are often articulated more clearly in building-block studies (e.g.,
cryptography) than in other settings of cybersecurity (e.g., what a chosen-
ciphertext attacker can do exactly). However, there are still gaps that are yet to
be bridged. First, assumptions may be stated implicitly. For example, cryptogra-
phy assumes that cryptographic keys are kept secret, either entirely or at least for
In [35], four classes of security metrics are defined: those for quantifying vulnera-
bilities (including user/human, interface-induced, and software vulnerabilities),
those for quantifying attack capabilities (including zero-day, targeted, botnet
attacks, malware, and evasion attacks), those for quantifying the effectiveness
of defenses (including preventive, reactive, proactive defense capabilities), and
those for quantifying situations (e.g., the percentage of compromised comput-
ers at a point in time). It is concluded in [35], and re-affirmed in [55], that the
problem “what should be measured” is largely open.
In a broader context, the existing metrics that can be adapted to measure agility
are classified into the following categories [8]: those for quantifying timeliness
(including detection time, overall agility quickness) and those for quantifying
usability (including ease of use, usefulness, defense cost).
By adapting the existing metrics that are defined in other contexts, resilience
metrics may be classified into the following families [8]: those for quantifying
fault tolerance (including mean-time-to-failure, percolation threshold,
diversity), those for quantifying adaptability (including degree of local deci-
sion, degree of intelligent decision, degree of automation), and those for quan-
tifying recoverability (including mean-time-to-full-recovery, mean-time-between-
failures, mean-time-to-repair, and intrusion response cost). There are no system-
atic studies on resilience metrics.
Risk is often investigated in the setting of hazards and is often defined as the product of threat (a probability estimated by domain experts or other means), vulnerability (another probability estimated by domain experts or other means), and consequence (the damage caused by the threat when it materializes) [22]. This means that risk is quantified as the expected or mean loss. However, this approach is not adequate for managing the risk incurred by terrorist attacks [9] because it cannot deal with, among other things, the dependence between many events (e.g., cascading failures). This immediately implies that this
approach is also inadequate for cybersecurity risk management because there are
many kinds of dependencies and interdependencies which make cybersecurity risks
exhibit emergent properties [17,34,36,46]. In order to deal with these problems,
Cybersecurity Dynamics offers a promising approach, especially its predictive
power in forecasting the evolution of dynamical situational awareness attained by
first-principle analyses (e.g., [11,18,19,26–28,42,45,49–51,54,60,61]) and data-
driven analyses (e.g., [5,15,24,25,43,44,57,58]).
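To illustrate the point, the sketch below (ours; all parameters are hypothetical) contrasts the classic expected-loss product with a simple Monte Carlo simulation in which compromises cascade, so that the tail of the loss distribution, rather than its mean, drives the risk.

```python
import random


def expected_loss(threat_p: float, vuln_p: float, consequence: float) -> float:
    """Classic hazard-style risk: threat x vulnerability x consequence (mean loss)."""
    return threat_p * vuln_p * consequence


def simulated_tail_loss(n_assets=100, threat_p=0.1, vuln_p=0.3, consequence=1.0,
                        cascade_p=0.5, trials=10000, quantile=0.95, seed=0):
    """Monte Carlo loss with a simple cascade: once one asset is compromised,
    each remaining asset falls with probability cascade_p (dependent events)."""
    rng = random.Random(seed)
    losses = []
    for _ in range(trials):
        loss, cascading = 0.0, False
        for _ in range(n_assets):
            p = cascade_p if cascading else threat_p * vuln_p
            if rng.random() < p:
                loss += consequence
                cascading = True
        losses.append(loss)
    losses.sort()
    return losses[int(quantile * trials) - 1]   # a tail loss, not the mean
```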
In medicine, for example, it is well understood how various kinds of metrics (e.g., blood pressure) reflect a human being's health condition (e.g., the presence or absence of certain diseases), and this kind of knowledge is applied to guide the practice of medical diagnosis and
treatment. Analogously, cybersecurity metrics research would need to identify,
invent, and define metrics (e.g., “cybersecurity blood pressure”) that reflect the
cybersecurity situations and can be applied to diagnose the “health conditions”
of networks or devices.
In order to accelerate the fostering of a research community, we can start
with some “grass roots” actions. For example, when one publishes a paper, the
author may strive to clearly articulate the assumptions that are needed by the
new result. Moreover, the author may strive to define metrics that are impor-
tant to quantify the progress made by the new result [35]. Furthermore, when
we teach cybersecurity courses, we should strive to make students aware that
much research remains to be done in order to tackle the fundamental problems
of cybersecurity metrics and quantification. For this purpose, we would need to
develop new curriculum materials.
Developing a Science of Cybersecurity Measurement. Well-defined cybersecurity metrics need to be measured in the real world, which demands the support of principled (rather than heuristic) methods. This problem may seem trivial at first glance, and it may indeed be so for some metrics in some settings. However, the accurate measurement of cybersecurity metrics can be very challenging, perhaps analogous to the measurement of the speed of light or the gravitational constant in physics. To see this, let us consider a simple and well-defined metric: What is the fraction (or percentage) of the devices in a network that are compromised at a given point in time t? The measurement of this metric is challenging in practice when the network is large. The reason is that automated or semi-automated tools (e.g., intrusion detection systems and/or anti-malware tools) that can be leveraged for measurement purposes are not necessarily trustworthy because of their false positives and false negatives.
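As a small illustration of this measurement problem (ours, not from the paper): if the detector's true-positive and false-positive rates are known, the raw alert rate can at least be corrected with a Rogan-Gladen-style prevalence adjustment.

```python
def corrected_compromise_fraction(alert_rate: float,
                                  true_positive_rate: float,
                                  false_positive_rate: float) -> float:
    """Estimate the true fraction of compromised devices from the observed
    fraction flagged by an imperfect detector."""
    denom = true_positive_rate - false_positive_rate
    if denom <= 0:
        raise ValueError("detector must beat its false-positive rate")
    estimate = (alert_rate - false_positive_rate) / denom
    return min(max(estimate, 0.0), 1.0)   # clamp to a valid fraction


# e.g. 12% of devices raise alerts with TPR = 0.80 and FPR = 0.05:
# corrected_compromise_fraction(0.12, 0.80, 0.05) -> ~0.093
```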
5 Conclusion
We have presented a framework to unify security metrics, agility metrics,
resilience metrics, and risk metrics. The framework is driven by the assump-
tions that are made at the design and operation phases, while embracing the
uncertainty about whether these assumptions are violated or not in the real
world. We identified a number of gaps that have not been discussed in the lit-
erature but must be bridged in order to tackle the problem of Cybersecurity
Metrics and Quantification and ultimately tame cybersecurity. In particular, we
must bridge the assumption gap and the uncertainty gap, which are inherent to
the discrepancies between designers’ views at lower levels of abstractions (i.e.,
building-blocks and components) and operators’ views at high levels of abstrac-
tions (i.e., networks and devices). We presented a number of future research
directions. In addition, it is interesting to investigate how to extend the SARR
framework to accommodate other kinds of metrics, such as dependability.
Acknowledgement. We thank Moti Yung for illuminating discussions and Eric Ficke
for proofreading the paper. This work was supported in part by ARO Grant #W911NF-
17-1-0566, NSF Grants #2115134 and #2122631 (#1814825), and by a Grant from the
State of Colorado.
References
1. Charlton, J., Du, P., Cho, J., Xu, S.: Measuring relative accuracy of malware
detectors in the absence of ground truth. In: Proceedings of IEEE MILCOM, pp.
450–455 (2018)
2. Charlton, J., Du, P., Xu, S.: A new method for inferring ground-truth labels. In:
Proceedings of SciSec (2021)
3. Chen, H., Cho, J., Xu, S.: Quantifying the security effectiveness of firewalls and
DMZs. In: Proceedings of HoTSoS 2018, pp. 9:1–9:11 (2018)
4. Chen, H., Cho, J., Xu, S.: Quantifying the security effectiveness of network diver-
sity. In: Proceedings of HoTSoS 2018, p. 24:1 (2018)
5. Chen, Y., Huang, Z., Xu, S., Lai, Y.: Spatiotemporal patterns and predictability
of cyberattacks. PLoS ONE 10(5), e0124472 (2015)
6. Cheng, Y., Deng, J., Li, J., DeLoach, S., Singhal, A., Ou, X.: Metrics of security.
In: Cyber Defense and Situational Awareness, pp. 263–295 (2014)
7. Cho, J., Hurley, P., Xu, S.: Metrics and measurement of trustworthy systems. In:
Proceedings IEEE MILCOM (2016)
8. Cho, J., Xu, S., Hurley, P., Mackay, M., Benjamin, T., Beaumont, M.: STRAM:
measuring the trustworthiness of computer-based systems. ACM Comput. Surv.
51(6), 128:1–128:47 (2019)
9. National Research Council: Review of the Department of Homeland Security’s
Approach to Risk Analysis. The National Academies Press (2010)
10. INFOSEC Research Council: Hard problem list (2007). http://www.infosec-research.org/docs_public/20051130-IRC-HPL-FINAL.pdf
11. Da, G., Xu, M., Xu, S.: A new approach to modeling and analyzing security of
networked systems. In: Proceedings HotSoS 2014, pp. 6:1–6:12 (2014)
12. Dai, W., Parker, P., Jin, H., Xu, S.: Enhancing data trustworthiness via assured
digital signing. IEEE TDSC 9(6), 838–851 (2012)
13. Du, P., Sun, Z., Chen, H., Cho, J.H., Xu, S.: Statistical estimation of malware
detection metrics in the absence of ground truth. IEEE T-IFS 13(12), 2965–2980
(2018)
14. Durumeric, Z., et al.: The matter of heartbleed. In: Proceedings IMC (2014)
15. Fang, Z., Xu, M., Xu, S., Hu, T.: A framework for predicting data breach risk:
leveraging dependence to cope with sparsity. IEEE T-IFS 16, 2186–2201 (2021)
16. Goldreich, O.: The Foundations of Cryptography, vol. 1. Cambridge University
Press (2001)
17. Haimes, Y.Y.: On the definition of resilience in systems. Risk Anal. 29(4), 498–501
(2009)
18. Han, Y., Lu, W., Xu, S.: Characterizing the power of moving target defense via
cyber epidemic dynamics. In: HotSoS, pp. 1–12 (2014)
19. Han, Y., Lu, W., Xu, S.: Preventive and reactive cyber defense dynamics with
ergodic time-dependent parameters is globally attractive. IEEE TNSE, accepted
for publication (2021)
20. Harrison, K., Xu, S.: Protecting cryptographic keys from memory disclosures. In:
IEEE/IFIP DSN 2007, pp. 137–143 (2007)
21. Homer, J., et al.: Aggregating vulnerability metrics in enterprise networks using
attack graphs. J. Comput. Secur. 21(4), 561–597 (2013)
22. Jensen, U.: Probabilistic risk analysis: foundations and methods. J. Am. Stat.
Assoc. 97(459), 925 (2002)
23. Kantchelian, A., et al.: Better malware ground truth: techniques for weighting
anti-virus vendor labels. In: Proceedings AISec, pp. 45–56 (2015)
24. Li, D., Li, Q., Ye, Y., Xu, S.: SoK: arms race in adversarial malware detection.
CoRR, abs/2005.11671 (2020)
25. Li, D., Li, Q., Ye, Y., Xu, S.: A framework for enhancing deep neural networks
against adversarial malware. IEEE TNSE 8(1), 736–750 (2021)
26. Li, X., Parker, P., Xu, S.: A stochastic model for quantitative security analyses of
networked systems. IEEE TDSC 8(1), 28–43 (2011)
27. Lin, Z., Lu, W., Xu, S.: Unified preventive and reactive cyber defense dynamics is
still globally convergent. IEEE/ACM ToN 27(3), 1098–1111 (2019)
28. Lu, W., Xu, S., Yi, X.: Optimizing active cyber defense dynamics. In: Proceedings
GameSec 2013, pp. 206–225 (2013)
29. Lynch, N.: Distributed Algorithms. Morgan Kaufmann (1996)
30. Mireles, J., Ficke, E., Cho, J., Hurley, P., Xu, S.: Metrics towards measuring cyber
agility. IEEE T-IFS 14(12), 3217–3232 (2019)
31. Morales, J., Xu, S., Sandhu, R.: Analyzing malware detection efficiency with mul-
tiple anti-malware programs. In: Proceedings CyberSecurity (2012)
32. Nicol, D., et al.: The science of security: 5 hard problems, August 2015. http://cps-vo.org/node/21590
33. Noel, S., Jajodia, S.: A suite of metrics for network attack graph analytics. In:
Network Security Metrics, pp. 141–176. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66505-4_7
34. Park, J., Seager, T.P., Rao, P.S.C., Convertino, M., Linkov, I.: Integrating risk
and resilience approaches to catastrophe management in engineering systems. Risk
Anal. 33(3), 356–367 (2013)
35. Pendleton, M., Garcia-Lebron, R., Cho, J., Xu, S.: A survey on systems security
metrics. ACM Comput. Surv. 49(4), 62:1–62:35 (2016)
36. Pfleeger, S.L., Cunningham, R.K.: Why measuring security is hard. IEEE Secur.
Priv. 8(4), 46–54 (2010)
37. Ramos, A., Lazar, M., Filho, R.H., Rodrigues, J.J.P.C.: Model-based quantitative
network security metrics: a survey. IEEE Commun. Surv. Tutor. 19(4), 2704–2734
(2017)
38. National Science and Technology Council: Trustworthy cyberspace: strategic
plan for the federal cybersecurity research and development program (2011).
https://www.nitrd.gov/SUBCOMMITTEE/csia/Fed_Cybersecurity_RD_Strategic_Plan_2011.pdf
39. Wang, L., Jajodia, S., Singhal, A.: Network Security Metrics. Network Security
Metrics, Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66505-4
40. Wang, L., Jajodia, S., Singhal, A., Cheng, P., Noel, S.: k-zero day safety: a network
security metric for measuring the risk of unknown vulnerabilities. IEEE TDSC
11(1), 30–44 (2014)
41. Xu, L., et al.: KCRS: a blockchain-based key compromise resilient signature system.
In: Proceedings BlockSys, pp. 226–239 (2019)
42. Xu, M., Da, G., Xu, S.: Cyber epidemic models with dependences. Internet Math.
11(1), 62–92 (2015)
43. Xu, M., Hua, L., Xu, S.: A vine copula model for predicting the effectiveness of
cyber defense early-warning. Technometrics 59(4), 508–520 (2017)
44. Xu, M., Schweitzer, K.M., Bateman, R.M., Xu, S.: Modeling and predicting cyber
hacking breaches. IEEE T-IFS 13(11), 2856–2871 (2018)
45. Xu, M., Xu, S.: An extended stochastic model for quantitative security analysis of
networked systems. Internet Math. 8(3), 288–320 (2012)
46. Xu, S.: Emergent behavior in cybersecurity. In: Proceedings HotSoS, pp. 13:1–13:2
(2014)
47. Xu, S.: Cybersecurity dynamics: a foundation for the science of cybersecurity. In:
Proactive and Dynamic Network Defense, pp. 1–31 (2019)
48. Xu, S.: The cybersecurity dynamics way of thinking and landscape (invited paper).
In: ACM Workshop on Moving Target Defense (2020)
49. Xu, S., Lu, W., Xu, L.: Push- and pull-based epidemic spreading in networks:
thresholds and deeper insights. ACM TAAS 7(3), 1–26 (2012)
50. Xu, S., Lu, W., Xu, L., Zhan, Z.: Adaptive epidemic dynamics in networks: thresh-
olds and control. ACM TAAS 8(4), 1–19 (2014)
51. Xu, S., Lu, W., Zhan, Z.: A stochastic model of multivirus dynamics. IEEE Trans.
Dependable Secure Comput. 9(1), 30–45 (2012)
52. Xu, S., Yung, M.: Expecting the unexpected: towards robust credential infrastruc-
ture. In: Financial Crypto, pp. 201–221 (2009)
53. Xu, S.: Cybersecurity dynamics. In: Proceedings HotSoS 2014, pp. 14:1–14:2 (2014)
54. Xu, S., Lu, W., Li, H.: A stochastic model of active cyber defense dynamics.
Internet Math. 11(1), 23–61 (2015)
55. Xu, S., Trivedi, K.: Report of the 2019 SATC pi meeting break-out session on
“cybersecurity metrics: Why is it so hard?” (2019)
56. Xu, S., Yung, M., Wang, J.: Seeking foundations for the science of cyber security.
Inf. Syst. Front. 23, 263–267 (2021)
57. Zhan, Z., Xu, M., Xu, S.: Characterizing honeypot-captured cyber attacks: statis-
tical framework and case study. IEEE T-IFS 8(11), 1775–1789 (2013)
58. Zhan, Z., Xu, M., Xu, S.: Predicting cyber attack rates with extreme values.
IEEE T-IFS 10(8), 1666–1677 (2015)
59. Zhang, M., Wang, L., Jajodia, S., Singhal, A., Albanese, M.: Network diversity: a
security metric for evaluating the resilience of networks against zero-day attacks.
IEEE Trans. Inf. Forensics Secur. 11(5), 1071–1086 (2016)
60. Zheng, R., Lu, W., Xu, S.: Active cyber defense dynamics exhibiting rich phenom-
ena. In: Proceedings HotSoS (2015)
61. Zheng, R., Lu, W., Xu, S.: Preventive and reactive cyber defense dynamics is
globally stable. IEEE TNSE 5(2), 156–170 (2018)
Detection for Cybersecurity
Detecting Internet-Scale Surveillance
Devices Using RTSP Recessive Features
1 Introduction
As a video transmission protocol, RTSP is the most widely used application protocol that manufacturers implement as the remote video transmission service for their video surveillance products. According to statistics from 2019, there were 2,290,633 hosts exposing the RTSP service on the Internet [18]. However, only 526,241 of them could be identified as surveillance devices, while a large remaining number of RTSP hosts (1,764,392, about 77% of the total) were tagged as Unknown. Even though these unknown RTSP hosts may include some RTSP streaming servers or honeypots, the number of unidentified surveillance devices is still very large.
According to our analysis, the two main reasons why most RTSP hosts cannot be identified are as follows: (1) Single probing payload. Current mainstream search engines (such as Shodan [13], Zoomeye [18], and Censys [5]) and previous studies [10] employ only the OPTIONS request as the probing payload to obtain protocol banners. In fact, this is only one of the 11 methods originally designed for RTSP request packets [12], so the identification source is limited. (2) Simple dominant features. Previous approaches employed obvious characteristic fields (commonly the name of the manufacturer, such as Hikvision or Dahua) as the fingerprinting feature. This kind of feature is simple and intuitive, and mostly effective. However, only a small fraction of video surveillance devices provide these dominant keywords in their protocol banners. Moreover, more and more manufacturers have removed obvious vendor/model names from the responses of their new products. As a result, the scope of identification becomes much smaller and the difficulty of fingerprinting increases greatly.
Motivation: In this paper, we aim to determine whether the Unknown RTSP hosts on the Internet are surveillance devices and, if so, to identify their manufacturers. The key to our work is mining new, effective non-dominant features and generating accurate fingerprints from RTSP protocol banners. We observe that every surveillance manufacturer implements the RTSP service between its products and a streaming server in its own way. As a result, these surveillance devices return a variety of response packets when they receive the same sequence of standard request methods (including OPTIONS, DESCRIBE, SETUP, etc.). Therefore, these distinctions can be employed as a new kind of recessive feature to distinguish surveillance brands, even if the various response packets do not contain obvious characteristic keywords.
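To make the probing idea concrete, the following Python sketch (ours, simplified; the paper sends 20 sequential requests, whereas we show six core methods, and the URL path and User-Agent are placeholders) issues a sequence of RTSP methods against port 554 and records each raw response.

```python
import socket

METHODS = ["OPTIONS", "DESCRIBE", "SETUP", "PLAY", "PAUSE", "TEARDOWN"]


def probe_rtsp(host: str, port: int = 554, timeout: float = 5.0) -> dict:
    """Send each RTSP method in sequence and record the raw response
    (None when the device does not answer, which is itself a feature)."""
    responses = {}
    for cseq, method in enumerate(METHODS, start=1):
        request = (
            f"{method} rtsp://{host}/ RTSP/1.0\r\n"
            f"CSeq: {cseq}\r\n"
            "User-Agent: probe-sketch\r\n"
            "\r\n"
        )
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.settimeout(timeout)
                sock.sendall(request.encode("ascii"))
                responses[method] = sock.recv(4096).decode("ascii", "replace")
        except OSError:
            responses[method] = None
    return responses
```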
Challenges: To this end, we need to address three main challenges, as follows:
– Non-standard products: each manufacturer implements RTSP differently across its diverse series of surveillance products, which greatly increases the difficulty of feature extraction.
– Too few labeled samples: there is no publicly available ground truth of surveillance devices labeled via RTSP recessive features for training neural networks.
– Uneven sample distribution: the unbalanced market share of the various brands makes the distribution of training and testing samples uneven.
Method: To address these challenges, we employ a novel method that consists of three parts, Text-CNN, DS3L, and Open-world SSL, abbreviated as TDO.
2 Related Work
Internet-Wide Discovery of Surveillance Devices. Since video surveillance devices are among the most typical IoT devices, previous studies on discovering surveillance devices in cyberspace have gone hand in hand with fingerprinting online IoT devices. Durumeric et al. proposed ZMap, which decreased Internet-wide scanning time from two years to one hour [6]. Based on ZMap, researchers proposed many fast Internet-wide scan mechanisms for IoT devices, including webcams [3]. At the same time, device search engines such as Shodan [13] and Censys [5] emerged in succession and provided Internet-wide device searching services to the public. Because online surveillance devices constituted most of the Mirai-infected botnets in 2016 [2], Antonakakis et al. determined that hundreds of thousands of IP cameras and DVRs were infected. A specific study on discovering almost 1.6 million surveillance
devices in cyberspace, which comes closest to our work, was given by Qiang et al. [10]. Although Qiang et al. also discussed RTSP as one application service of video surveillance devices, they only used HTTP webpages as the main fingerprinting source data.
The above-mentioned previous works mainly focus on fingerprinting surveillance devices using dominant features, such as obvious vendor and product names [5,7], or visible webpages [10]. However, these approaches can only cover the portion of target devices that carry dominant features. For the complex and irregular cyberspace, dominant features alone are clearly inadequate.
Protocol Recessive Features. To augment device cognition, a new line of work on extracting recessive features has emerged in recent years. Wang et al. used the HTML DOM tree and CSS styles as enhanced recessive features [14], which identified 40.76% more devices than using obvious vendor and product names [7]. Yang et al. [17] and Yan et al. [16] proposed neural networks to learn deep recessive features in protocol banners, such as special string fields. These works have partially increased the number of identifiable online IoT devices. However, previous work still performs poorly on RTSP-based surveillance devices, which is exactly the issue this paper aims to resolve.
[Fig. 1. Example RTSP responses of a Hikvision camera and a Uniview camera to the sequential OPTIONS, DESCRIBE, SETUP, PLAY, PAUSE, and TEARDOWN requests. The two brands differ in the method lists of the Public header returned for OPTIONS, in the realm/nonce values of the WWW-Authenticate headers, in the presence of a Date header (Hikvision only), and in status codes and response behavior: Hikvision returns "454 Session Not Found" for PLAY, no response for PAUSE, and "500 Internal Server Error" for TEARDOWN, whereas Uniview returns "401 ClientUnAuthorized" for PLAY, PAUSE, and TEARDOWN alike.]
To identify the brand of a surveillance device without relying on direct brand-name keywords as the observed feature, we choose three-dimensional recessive features in Response packets. First, we use the diverse responses to 20 sequential Request methods. Taking Fig. 1 as two typical examples, two IP cameras from different manufacturers return diverse responses, and the contents of each header field show an obvious contrast. Second, we explore the diverse response mechanisms of different brands. As shown in Fig. 1, the PAUSE Response of a Uniview camera is a normal packet, while the Hikvision camera returns nothing. This non-response is itself a characteristic that distinguishes the brand from others. Third, the status code is another useful feature. There are 44 kinds of RTSP status codes, implemented differently by the various manufacturers, such as "454" in the Hikvision PLAY Response and "401" in the Uniview PLAY Response.
With regard to the stability of the feature source data, the Responses of a device rarely change (except for the date, time, and the temporary strings in the "nonce" field) until its firmware is updated.
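A rough sketch (ours) of turning one raw response into the three feature dimensions just described: the status code, the header fields, and whether the device responded at all.

```python
from typing import Dict, Optional


def extract_recessive_features(raw: Optional[str]) -> Dict[str, object]:
    """Parse one raw RTSP response string into (responded, status code, headers)."""
    if raw is None:                       # a non-response is itself a feature
        return {"responded": False, "status": None, "headers": {}}
    lines = raw.split("\r\n")
    parts = lines[0].split()              # e.g. "RTSP/1.0 401 Unauthorized"
    status = parts[1] if len(parts) >= 2 else None
    headers = {}
    for line in lines[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip()] = value.strip()
    return {"responded": True, "status": status, "headers": headers}
```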
4 Methodology
In this section, we introduce the new TDO architecture for fingerprinting online surveillance devices based on mixed neural networks and deep learning methods. As illustrated in Fig. 2, the workflow of our architecture consists of six steps: (1) Data collection: we first send 20 sequential RTSP Request methods to hosts with port 554 open, both on our private Intranet and on the public Internet, and simultaneously collect the Responses from these offline and online devices. (2) Pre-processing: we clean and normalize each Response into a matrix sample in a unified format. (3) Input: based on manual labeling experience and fingerprints, we tag each known device with a label indicating its surveillance type and brand. The samples can thus be divided into two categories, labeled and unlabeled, which are used as the training and testing datasets, respectively. (4) Training: based on deep learning algorithms, we train the classification model. (5) Classification and (6) Identification: the trained model determines whether an unlabeled sample is a surveillance device and identifies its brand.
As illustrated in Fig. 2, our data sources are obtained from two environments: offline and online. First, concerning the offline data source, we constructed a private surveillance Intranet containing 67 popular surveillance devices that we purchased. In this white-box testing environment, we can explore the <Request, Response> methods of RTSP and ensure non-interruption in
4.2 Pre-processing
Although RTSP Response packets are semi-structured text in a specific RTSP format, they still need to be processed. As shown in Fig. 1, the Response packets contain some useless symbols, such as "=" and ":". To make the useful characteristics more effective, we first transform these symbols into ASCII codes and clean them with NLP (Natural Language Processing) tools [1]. Then, considering that the texts vary in length, we fix the length of each response at no more than 200. Most significantly, we process each response into the standard RTSP Response format with 44 fields, filling null for empty fields. After the above steps, each response is represented as a unified text matrix, and each sample is constructed from 20 sequential matrices for every host.
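The following sketch (ours; the exact field list, padding scheme, and per-field length are assumptions) illustrates this normalization: each response becomes a 44 x 200 matrix of ASCII codes, and 20 such matrices are stacked per host.

```python
import numpy as np

NUM_FIELDS = 44      # standard RTSP response fields (the paper's setting)
FIELD_LEN = 200      # maximum characters kept per field (assumed here)


def response_to_matrix(fields: dict) -> np.ndarray:
    """fields maps canonical field names to string values; missing fields stay null."""
    matrix = np.zeros((NUM_FIELDS, FIELD_LEN), dtype=np.float32)
    for row, name in enumerate(sorted(fields)[:NUM_FIELDS]):
        text = (fields.get(name) or "")[:FIELD_LEN]
        matrix[row, :len(text)] = [ord(c) for c in text]   # ASCII codes
    return matrix


def host_sample(responses: list) -> np.ndarray:
    """Stack the 20 sequential response matrices into one sample per host."""
    mats = [response_to_matrix(r) for r in responses[:20]]
    while len(mats) < 20:                     # pad missing responses with nulls
        mats.append(np.zeros((NUM_FIELDS, FIELD_LEN), dtype=np.float32))
    return np.stack(mats)                     # shape: (20, 44, 200)
```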
4.3 Labeling
The normalized samples need to be tagged with labels, which we do via two approaches based on common features. The label of each host contains three attributes: whether it is a surveillance device (T), whether it can be identified by a dominant feature (A), and its brand name (B). For instance, the label yi = {Ai, Ti, Bi} of a Hikvision IP camera i is {Yes, Yes, Hikvision}. With regard to a different class
of devices, like Hikvision NVRs, we still tag them with the same brand but treat them as a new unseen class. Hence, the classification process is conducted by three-dimensional classifiers, of which the first two are binary, while the number of classes (c ∈ C) of the last classifier depends on the actual number of brands in cyberspace. First, we manually label the offline dataset, which includes 67 private devices covering 43 brands. This manual labeling can be extended to a portion of the online samples (nearly 30,147) through similar features of the 20 sequential Responses: same status codes, similar fields, and similar content. However, these labeled samples (0.96% of the total) are still not enough to train a classifier model for the remaining samples. To increase the number of labeled samples, we secondly employ an existing fingerprinting approach [5]. By extracting dominant features of apparent brand keywords, we labeled 334,651 samples. Combining the above two approaches and merging duplicates, the labeled dataset Xn contains n (n = 341,324) samples belonging to 43 brands (10.93% of all samples), which serves as the training dataset, and the remaining m (m = 2,782,165) unlabeled samples constitute the testing dataset Ym.
4.4 Training
Considering that Response packets differ from common text, we employ an enhanced CNN instead of a plain CNN to better learn recessive features from the sequential and logically related Responses. As discussed above, we aim to use three recessive features (status code, content, and response method), which would be processed as ordinary word vectors by classic CNN models. In a trial test using the TextCNN program in Tensorflow [9], the over-fitting problem appeared after only one round of training, and the learned features focused on the samples of the 9 biggest brands. For these two reasons, we propose an enhanced CNN model [15] directed at extracting all recessive characteristics of the text matrices. Hence, we treat the 20 sequential matrices of each host like 20 consecutive photographs, which produces a reduced global feature map (xi) in the input layer. In the convolutional layer, sub-sampling follows to extract local features across several feature maps. The two-dimensional feature maps are then flattened in the fully connected layer, which links the positional relationships among the maps. Consequently, global and local recessive features can be fully learned by the enhanced CNN.
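A rough Keras sketch (ours, not the authors' exact architecture; layer sizes and kernel shapes are assumptions) of a CNN that treats the 20 response matrices of a host as stacked channels, extracts local features by convolution and sub-sampling, and feeds fully connected layers; the three output heads anticipate the three-dimensional classifiers of Sect. 4.5.

```python
import tensorflow as tf


def build_enhanced_cnn(num_brands: int) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(20, 44, 200))           # (responses, fields, chars)
    x = tf.keras.layers.Permute((2, 3, 1))(inputs)         # -> (44, 200, 20), channels-last
    x = tf.keras.layers.Conv2D(32, (3, 5), activation="relu", padding="same")(x)
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)            # sub-sampling for local features
    x = tf.keras.layers.Conv2D(64, (3, 5), activation="relu", padding="same")(x)
    x = tf.keras.layers.GlobalMaxPooling2D()(x)
    x = tf.keras.layers.Dense(128, activation="relu")(x)   # links positional relationships
    surveillance = tf.keras.layers.Dense(1, activation="sigmoid", name="is_surveillance")(x)
    dominant = tf.keras.layers.Dense(1, activation="sigmoid", name="has_dominant_feature")(x)
    brand = tf.keras.layers.Dense(num_brands, activation="softmax", name="brand")(x)
    return tf.keras.Model(inputs, [surveillance, dominant, brand])
```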
To address the further challenge of uneven distribution in the Internet-wide samples, we employ DS3L to strengthen the minority samples of small brands and maintain the majority samples of big brands [8]. According to our statistics of the final Internet-wide experimental results, the samples of the Top 5 brands (including Dahua, Hikvision, and Xiongmai) occupy nearly two-thirds of the total (3,123,489) samples, and the remaining one-third belong to more than 32 brands. For instance, the number of surveillance devices of the Bottom 10 brands (including Sony, Axis, Netgear, etc.) is 317, which is less than 0.02% of the Dahua (Top 1) samples. Thus, we use DS3L in two stages. First, we set a weight function w(xi; α) parameterized by α for the unlabeled samples and find the optimal model θ̂(α) as follows:
$$\hat{\theta}(\alpha) = \mathop{\arg\min}_{\theta \in \Theta}\; \sum_{i=1}^{n} \ell\big(h(x_i;\theta),\, y_i\big) \;+\; \sum_{i=n+1}^{n+m} w(x_i;\alpha)\,\Omega(x_i;\theta) \qquad (1)$$
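Schematically (our own paraphrase of Eq. (1); h, the supervised loss, Ω, and w are placeholders), the objective sums the supervised loss over the n labeled samples and an α-weighted regularizer over the m unlabeled samples.

```python
def ds3l_objective(theta, alpha, labeled, unlabeled, h, sup_loss, omega, w):
    """labeled:   list of (x_i, y_i) pairs, i = 1..n
       unlabeled: list of x_i,             i = n+1..n+m"""
    supervised = sum(sup_loss(h(x, theta), y) for x, y in labeled)
    regularizer = sum(w(x, alpha) * omega(x, theta) for x in unlabeled)
    return supervised + regularizer
```

In DS3L-style training, the weights α are themselves tuned in an outer loop so that unlabeled samples from unseen classes receive small weights and do not degrade the model.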
4.5 Classification
After the training process, the trained models are used for prediction as three-dimensional classifiers. As mentioned in Sect. 4.3, the three classifiers are logically associated as follows: the first classifier determines whether a host is a surveillance device, the second classifier detects whether it can be identified using dominant features, and the third classifier identifies its brand. We investigated the typical classification algorithms of machine learning, including support vector machines (SVM), decision trees, and neural networks. Considering the logical relationship among the classifiers and their actual performance (see Sect. 5.2), we select the neural network for classification together with three-dimensional logistic regression. Consequently, the trained classification neural network predicts whether an unlabeled sample of the testing dataset is a surveillance device. To finally identify its manufacturer, a multi-class classifier is then applied, as sketched below.
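A minimal sketch (ours) of the logical cascade; the three classifier objects are placeholders with a scikit-learn-style predict() interface.

```python
def classify_host(sample, surveillance_clf, dominant_clf, brand_clf):
    """Cascade the three classifiers: T (surveillance?), A (dominant feature?), B (brand)."""
    if not surveillance_clf.predict([sample])[0]:                     # classifier T
        return {"surveillance": False}
    identified_by_dominant = bool(dominant_clf.predict([sample])[0])  # classifier A
    brand = brand_clf.predict([sample])[0]                            # classifier B (multi-class)
    return {"surveillance": True,
            "dominant_feature": identified_by_dominant,
            "brand": brand}
```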
4.6 Identification
For the online experiment, we utilized bulk data from 2020 provided by search engines [13,18], containing 5,046,671 active hosts discovered via HTTP GET requests on port 554.
5.2 Evaluation
Measurement. To evaluate the performance, we introduce two evaluation indexes: precision and recall. Precision reflects the proportion of predicted surveillance devices that are correctly classified, recall reflects the proportion of actual surveillance devices that are correctly identified, and the F1-score, their harmonic mean, is calculated as follows:
$$\mathrm{Precision} = \frac{TP}{TP+FP},\qquad \mathrm{Recall} = \frac{TP}{TP+FN},\qquad F_1 = \frac{2\cdot \mathrm{Precision}\cdot \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \qquad (3)$$
where True Positive (TP) denotes the number of surveillance devices correctly classified, False Positive (FP) denotes the number of non-surveillance devices incorrectly classified as surveillance devices, and False Negative (FN) denotes the number of surveillance devices incorrectly classified as non-surveillance devices. Naturally, high precision and recall are the desirable outcomes.
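As a worked check of Eq. (3) on hypothetical counts:

```python
def prf1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# e.g. prf1(930, 50, 70) -> (~0.949, 0.930, ~0.939)
```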
Performance. As shown in Fig. 4, the trends of precision and F1-score were similar as the number of training samples increased, while the trend of recall was the opposite. When the number of training samples reached 150 thousand, the performance began to stabilize.
Fig. 4. Trend of classification performance along with the number of training samples.
5.3 Comparison
To further evaluate the effectiveness of our approach in meeting the above-mentioned challenges, we carry out two comparative trials at the type level and the brand level.
Comparison with Search Engines. Due to differences in probing methods and periods, the number of collected samples differs across search engines. We choose comparative samples from the year 2020 to make the comparison as fair as possible. Then, we compare the number of identified surveillance devices at the type level. Table 3 shows that the identification rate of our approach is eight times and three times higher than those of Shodan and Zoomeye, respectively. The underlying reasons are twofold: using recessive features and the deep semi-supervised algorithm helps our approach achieve better classification results.
Comparison with Existing Approaches. Table 3 shows the comparison between our approach and ARE [7]/IoTtracker [14]. Using the same dataset of 3,123,489 samples, our approach identifies 7.2 times and 4.7 times more surveillance devices with their brands than ARE [7] and IoTtracker [14], respectively. Compared with ARE, which generates fingerprints based on dominant features [7], our approach adds new feature space from the recessive features of the twenty sequential Responses. Compared with IoTtracker, which similarly uses recessive features of semi-structured contents, our approach improves the identification results by combining the enhanced TextCNN, DS3L, and open-world algorithms.
5.4 Distribution
We analyzed the distribution of the identified results in two dimensions: geography and brand.
Distribution of Countries. We locate the identified type-level results in 97 countries, with the Top 10 countries accounting for 84% of the 2,803,406 surveillance devices. Table 4 indicates that the largest number of devices (nearly one-third) is located in China. The two main reasons are: (1) most of the identified brands (see the right part of Table 4) are manufactured in China; (2) the statistical results (including Taiwan, Hong Kong, etc.) are combined into China because these regions belong to China.
accounting for 89.7% of all samples. Most significantly, the precision and recall of our experimental results both reach up to 93%.
References
1. Nltk: the natural language toolkit. http://www.nltk.org/
2. Antonakakis, M., et al.: Understanding the mirai botnet. In: Proceedings of 26th
USENIX Security Symposium (2017)
3. Bouharb, E., Debbabi, M., Assi, C.: Cyber scanning: a comprehensive survey. In:
IEEE Communications Surveys and Tutorials. vol. 16, pp. 1496–1519 (2014)
4. Cao, K., Brbić, M., Leskovec, J.: Open-world semi-supervised learning. In:
arXiv:2102.03526 (2021)
5. Durumeric, Z., Adrian, D., Mirian, A., Bailey, M., Halderman, J.A.: A search
engine backed by internet-wide scanning. In: Proceedings of 22nd Computer and
Communications Security (2015)
6. Durumeric, Z., Wustrow, E., Halderman, J.A.: Zmap: fast internet-wide scanning
and its security applications. In: Proceedings of 23th USENIX Security Symposium
(2013)
7. Feng, X., Li, Q., Wang, H., Sun, L.: Acquisitional rule-based engine for discovering
internet-of-thing devices. In: Proceedings of 27th USENIX Security Symposium
(2018)
8. Guo, L.Z., Zhang, Z.Y., Jiang, Y., Li, Y.F., Zhou, Z.H.: Safe deep semi-supervised
learning for unseen-class unlabeled data. In: III, H.D., Singh, A. (eds.) Proceedings
of the 37th International Conference on Machine Learning Proceedings of Machine
Learning Research, vol. 119, pp. 3897–3906. PMLR (2020)
9. Kim, Y.: Convolutional neural networks for sentence classification. In: Empirical
Methods in Natural Language Processing, pp. 1746–1751 (2014)
10. Li, Q., Feng, X., Wang, H., Sun, L.: Automatically discovering surveillance devices
in the cyberspace. In: Proceedings of 8th ACM International Conference on Mul-
timedia System (2017)
11. Michael Gilleland, M.P.S.: Levenshtein distance, in three flavors. https://people.cs.pitt.edu/ (2006)
12. Schulzrinne, H., Rao, A., Lanphier, R.: Real time streaming protocol (RTSP). RFC 2326 (1998)
13. Shodan: https://www.shodan.io/explore/tag/webcam
14. Wang, X., Wang, Y., Feng, X., Zhu, H., Sun, L., Zou, Y.: Iottracker: an enhanced
engine for discovering internet-of-thing devices. In: Proceedings of IEEE WoW-
MoM (2019)
15. Yan, X., Jacky, K., Bennin, K.E., Qing, M.: Improving bug localization with word
embedding and enhanced convolutional neural networks. Inf. Softw. Technol. 105,
17–29 (2019)
16. Yan, Z., Lv, S., Zhang, Y., Zhu, H., Sun, L.: Remote fingerprinting on internet-wide
printers based on neural network. In: Proceedings of IEEE GLOBECOM (2019)
17. Yang, K., Li, Q., Sun, L.: Towards automatic fingerprinting of IOT devices in the
cyberspace. Comput. Netw. 148, 318–327 (2019)
18. Zoomeye: https://www.zoomeye.org/
An Intrusion Detection Framework for
IoT Using Partial Domain Adaptation
Yulin Fan1,2, Yang Li1,2, Huajun Cui1, Huiran Yang1, Yan Zhang1,2, and Weiping Wang1,2
1 Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
[email protected]
2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
1 Introduction
The Internet of things (IoT) is actively shaping the world. It combines various
sensors and end-devices with the Internet to realize the interconnection of peo-
ple, machines and things at any time and place [1]. As the application of IoT will
involve various fields and influence all walks of life, the importance of its network security is self-evident. Due to the open deployment environment, limited resources, the inherent security loopholes of the network, and the vulnerability of IoT terminal equipment, the harm and losses caused by attacks can be greater than in comparable situations on the traditional Internet. Therefore, research on IoT security technology is particularly important.
Network intrusion detection (NID) is a security defense technology that actively collects and analyzes network information, trying to find out whether a security policy has been violated. Much research on NIDs has been based on machine learning (ML) and deep learning (DL). These methods capture packets from the network layer and extract characteristics based on packets or flows to train an NID model. They do not depend on the underlying protocol and provide good adaptability. At present, there are some publicly available NID datasets for research, such as DARPA [4] and KDD [3]. However, the privacy and distributed nature of the IoT make it difficult to collect representative IoT datasets. Therefore, some IoT NID schemes are based on traditional NID data rather than IoT data [8]. However, labeled data from the traditional Internet may not be suitable for training DL models for the IoT. Moreover, due to heterogeneity, different IoT networks have different traffic patterns. Even an NID model trained on data collected from one IoT network often has poor generalization performance and cannot be applied directly to another IoT network.
To address the scarcity of labeled data and the differences in data distribution, domain adaptation (DA) has recently been used in NIDs [10–12]. DA is a branch of transfer learning [13] that transfers knowledge gained from a source domain with adequate labeled data to a different but similar target domain with scarce, unlabeled data [14]. In our case, the source domain is a large labeled NID dataset collected from the traditional Internet, while the target domain is a relatively smaller unlabeled NID dataset drawn from a specific IoT network. On the one hand, our source and target domains differ because their traffic patterns differ, owing to different network protocols, architectures, and application modes. On the other hand, the attack patterns in the IoT are similar to those in the traditional Internet, such as Man-in-the-Middle attacks and Botnets. DA exploits the similarity between the data to apply knowledge previously learned in the source domain to the new, unknown target domain. Since most DA methods map the source and target domains to a common domain-invariant space, the label spaces of the two domains must be the same for transfer to be feasible. If the label spaces differ, the effect of DA is greatly damaged, a phenomenon called negative transfer. Ignoring negative transfer leads to many false positives and even performance degradation. Therefore, the inconsistency between the source and target label spaces must be taken into account.
In reality, the source domain and the target domain often have different label spaces, and this is especially true in our IoT setting. For example, the traditional Internet contains various attacks, including Web Attacks such as XSS and
SQL Injection, as well as DDoS, Botnet, and others. Owing to its large number of end devices, the IoT is also vulnerable to DDoS and Botnet attacks. However, many resource-constrained IoT networks, such as remote meter reading and smart parking, hardly suffer from Web Attacks. It is therefore reasonable to assume that the attack types in the IoT domain are a subset of those in the traditional Internet domain. The classes that both domains contain constitute the shared label spaces, while the classes contained only in the source domain constitute the private label spaces. If knowledge covering both the private label spaces (like Web Attack) and the shared label spaces (like DDoS) is transferred directly to the IoT, negative transfer may occur. Fortunately, the Partial Domain Adaptation (PDA) method [18] has been proposed to handle inconsistent label spaces. PDA is a DA technique that finds the common parts (the shared label spaces) and restrains the private label spaces of the source domain, which have little relationship with the target domain, so as to improve transfer performance.
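As a concrete illustration (hypothetical class sets, but consistent with the attack classes discussed later in this paper): if Y_s = {Benign, DDoS, Botnet, Port Scan, Web Attack, Brute Force} and Y_t = {Benign, DDoS, Botnet, Port Scan}, then the shared label space is Y_t itself, while the private label space of the source domain is Y_s \ Y_t = {Web Attack, Brute Force}; transferring the latter directly would cause negative transfer.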
In this paper, we apply a weighted adversarial-nets-based PDA method [19] to NID in the IoT. It can transfer knowledge from one dataset to another even though the label spaces of the two domains differ. It uses an adversarial network structure to map the two domains into a common domain-invariant feature space for domain adaptation, and it reduces the weights of samples from the private label spaces to restrain negative transfer. Our contributions are as follows:
– To the best of our knowledge, we are the first to address the problem of inconsistent label spaces between the Internet and the IoT and to apply a PDA method to NID in the IoT. We use PDA to train a highly accurate NID model with an unlabeled IoT dataset by transferring knowledge from the abundant labeled public datasets of the traditional Internet, in a scenario where the source and target datasets have different label spaces. To foster further research, the source code is publicly available at https://github.com/rainforest2378/IoT-PDA.git.
– The proposed NID scheme can identify unknown attacks in the IoT with the help of the abundant labeled public datasets of the traditional Internet.
– We design an online NID framework based on offline training to detect attacks automatically in real time, which is well suited to real IoT applications.
– We evaluate the method on two available NID datasets, from the traditional Internet and the IoT respectively. Experimental results show that, with only a small amount of unlabeled IoT data, the proposed PDA-based NID approach achieves good attack-detection performance.
2 Background
In the GAN framework [21], two models are trained simultaneously: the generative model G and the discriminative model D. G is responsible for capturing the data distribution, and D is responsible for estimating the probability that a sample comes
from the true distribution. G and D play against each other, and both are strengthened through this game.
In computer vision, G is a network that generates pictures. It receives a random noise z and generates a picture from this noise, denoted G(z). The goal of G is to generate pictures that are as realistic as possible in order to deceive D. D takes real images and generated (fake) images as input, and outputs the probability that its input x is a real picture (i.e., comes from the distribution of real images). The closer D(x) is to 1, the more likely the input image is real. The goal of D is to separate the images generated by G from the real images as well as possible.
In this way, G and D form a dynamic "game process", and the minimax loss of the GAN can be denoted as

\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]    (1)
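For illustration only (this is our own sketch, not code from this paper), one training step of this minimax game could look as follows in PyTorch, assuming a generator G, a discriminator D ending in a sigmoid, and pre-built optimizers:

import torch

def gan_step(G, D, real, opt_G, opt_D, z_dim=100, eps=1e-8):
    """One illustrative GAN step following the minimax loss in Eq. (1)."""
    z = torch.randn(real.size(0), z_dim)

    # Update D: maximize log D(x) + log(1 - D(G(z))), i.e. minimize the negative.
    opt_D.zero_grad()
    loss_D = -(torch.log(D(real) + eps).mean()
               + torch.log(1 - D(G(z).detach()) + eps).mean())
    loss_D.backward()
    opt_D.step()

    # Update G: minimize log(1 - D(G(z))).
    opt_G.zero_grad()
    loss_G = torch.log(1 - D(G(z)) + eps).mean()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()

The same adversarial structure is reused below, with the feature extractors playing the role of G and the domain classifiers playing the role of D.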
First, we preprocess the original PCAP packets and extract flow-based statistical features to prepare the source data and the target data; then the feature extractor Fs and the classifier are trained for the source domain. Second, we perform PDA to train the feature extractor Ft of the target domain and two domain discriminators (D and D0). Third, an online intrusion detection model is designed to mark target samples as benign or as known attack classes, or even to identify unknown attacks.
Fig. 1. Overview of the three phases of our scheme: pre-processing and pre-training, partial domain adaptation, and online intrusion detection. The blue trapezoid Fs indicates the feature extractor of the source domain. The green trapezoid Ft represents the feature extractor of the target domain. The pink trapezoid D indicates the domain classifier that produces weights, while the brown one D0 indicates the domain classifier that identifies source and target samples. A block filled with slashes indicates that its model is fixed and its parameters are not updated during the current phase. (Color figure online)
We start with the description of our system model. In this paper, the source domain with labeled data is denoted as $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, where $x_i^s$ is a source-domain sample drawn from the Internet distribution $P_s(x)$ and $i$ is the index of source samples. The target domain with unlabeled data is denoted as $D_t = \{x_j^t\}_{j=1}^{n_t}$, where $x_j^t$ is a target-domain sample drawn from the IoT distribution $P_t(x)$ and $j$ is the index of target samples. $n_s$ and $n_t$ are the numbers of source and target samples, respectively. Our proposal focuses on settings with sufficient source data and limited target data, so $n_s > n_t$. We assume that the feature spaces of the two domains are the same, i.e., $X_s = X_t$, and that the attacks in the IoT are a subset of the traditional Internet attacks; hence the label spaces are different, and the label space of the target domain is contained in that of the source domain, i.e., $Y_t \subseteq Y_s$. The marginal distributions of the two domains are different, i.e., $P_s(x) \neq P_t(x)$. Our task is to train an NID model for the target IoT domain $D_t$ to predict the labels $y^t \in Y_t$ with the help of the source Internet domain $D_s$.
As shown in the left part of Fig. 1, we first preprocess the raw network packets of the traditional Internet and of the IoT network, and pretrain a classification model for the traditional Internet. We use the same tool to extract the same statistical features, so that the source domain and the target domain have the same feature space. CICFlowMeter [22] is an open-source tool that generates bidirectional flows (bi-flows) from PCAP files and extracts statistical, time-related features from these flows. In general, a unidirectional flow is a set of packets with the same protocol type, source IP address, destination IP address, source port, and destination port. A bi-flow can then be defined as the set of network packets moving forward or backward between two endpoints. To obtain time-related statistics, bi-flows are counted over a limited period of time: TCP bi-flows are terminated by the end of the TCP connection (signaled by a FIN packet), while UDP bi-flows are terminated by a flow timeout, whose value can be set arbitrarily.
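To make this grouping concrete, the following sketch (our own illustration, not CICFlowMeter's implementation; packet field names such as src_ip, ts, and fin are assumptions) assembles time-ordered packets into bi-flows using a direction-insensitive 5-tuple key:

def biflow_key(pkt):
    """Direction-insensitive 5-tuple so both directions map to the same bi-flow."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (pkt["proto"],) + (a + b if a <= b else b + a)

def group_biflows(packets, flow_timeout=120.0):
    """Group packets (sorted by timestamp) into bi-flows.
    TCP bi-flows end on a FIN packet; UDP bi-flows end after flow_timeout of inactivity."""
    finished, active = [], {}
    for pkt in packets:
        key = biflow_key(pkt)
        flow = active.get(key)
        if (flow is not None and pkt["proto"] == "UDP"
                and pkt["ts"] - flow["last_ts"] > flow_timeout):
            finished.append(active.pop(key))   # timed out: close the old flow
            flow = None
        if flow is None:
            flow = active[key] = {"pkts": [], "last_ts": pkt["ts"]}
        flow["pkts"].append(pkt)
        flow["last_ts"] = pkt["ts"]
        if pkt["proto"] == "TCP" and pkt.get("fin"):
            finished.append(active.pop(key))   # TCP connection ended
    finished.extend(active.values())           # flush flows still open at the end
    return finished

Statistical, time-related features (durations, inter-arrival times, byte counts, and so on) are then computed per bi-flow to form the source and target feature vectors.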
We also pre-train the NID model C(Fs(x^s)) of the source domain in this phase. The feature extractor Fs of the source domain consists of two convolutional layers (with 64 and 128 filters), two max-pooling layers, one flatten layer, and two fully connected layers (with 128 and 64 neurons), all activated by the ReLU function. The attack classifier C consists of a layer of neurons (one per attack class of the source samples) activated by the sigmoid function. Fs extracts high-level features of the source domain, and C classifies the source samples into benign or attack classes y^s, such as DDoS, Web Attack, Port Scan, Botnet, and Brute Force. Note that Fs and C will be reused (with fixed parameters) in the subsequent PDA training and online intrusion detection phases, respectively. The model C(Fs(x^s)) for the source domain is obtained by minimizing a supervised classification loss and learning the parameters of Fs and C.
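For concreteness, a minimal PyTorch sketch of Fs and C as described above is given below. The layer widths follow the text; the 1-D convolution kernel size, the pooling size, the input feature length, and the use of cross-entropy as the training loss are our assumptions, since they are not specified here:

import torch
import torch.nn as nn

NUM_FEATURES = 78     # assumed length of the flow-feature vector
NUM_CLASSES = 6       # assumed number of benign/attack classes in the source dataset

class SourceFeatureExtractor(nn.Module):      # Fs
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Flatten(),
            nn.Linear(128 * (NUM_FEATURES // 4), 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )

    def forward(self, x):                     # x: (batch, NUM_FEATURES)
        return self.net(x.unsqueeze(1))       # add a channel dimension for Conv1d

class AttackClassifier(nn.Module):            # C
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(64, NUM_CLASSES)

    def forward(self, z):
        return torch.sigmoid(self.fc(z))      # sigmoid activation, as described above

# Pre-training (sketch): minimize a classification loss on labeled source flows;
# cross-entropy is assumed here, in which case the pre-sigmoid logits would be
# fed to nn.CrossEntropyLoss rather than the sigmoid outputs.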
This phase aims to realize knowledge transfer through partial domain adaptation. A weighted adversarial-nets-based PDA method [19] is applied to promote the alignment of the shared label spaces. Two domain classifiers, D and D0, and two feature extractors, Fs and Ft, are adopted. Fs is obtained in the pre-training phase; Ft shares the same network structure as Fs but has its own parameters. The outputs of the feature extractors are fed into D and D0, which are ordinary neural networks: D consists of two fully connected layers with 20 and 10 neurons, and D0 consists of three fully connected layers with 50, 40, and 10 neurons.
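A minimal PyTorch reading of these two domain classifiers is sketched below; only the hidden-layer widths are stated above, so the single-unit sigmoid output layer and the ReLU activations are our assumptions:

import torch.nn as nn

class DomainClassifierD(nn.Module):
    """D: scores source-likeness of a feature; used to produce importance weights (Eq. (4))."""
    def __init__(self, feat_dim=64):          # 64 = output size of Fs/Ft in the sketch above
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 20), nn.ReLU(),
            nn.Linear(20, 10), nn.ReLU(),
            nn.Linear(10, 1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

class DomainClassifierD0(nn.Module):
    """D0: adversarially separates weighted source features from target features (Eq. (5))."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 50), nn.ReLU(),
            nn.Linear(50, 40), nn.ReLU(),
            nn.Linear(40, 10), nn.ReLU(),
            nn.Linear(10, 1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)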
The first domain classifier D is trained to distinguish the source features Fs(x) from the target features Ft(x) by maximizing

\max_{D} L_D(D, F_s, F_t) = \mathbb{E}_{x \sim P_s(x)}[\log D(F_s(x))] + \mathbb{E}_{x \sim P_t(x)}[\log(1 - D(F_t(x)))]    (3)
After this training, D has converged to its optimal value, so D(z) indicates how likely a sample is to come from the shared label spaces or from the private label spaces of the source domain. An importance weight ω(z^s) is assigned to each source sample to adjust its effect on the transfer process. The weight is inversely related to D(z^s) [19]:

\omega(z^s) = 1 - D(z^s) = \frac{1}{\frac{P_s(z^s)}{P_t(z^s)} + 1}    (4)
If the ratio P_s(z^s)/P_t(z^s) is high, the sample can easily be discriminated from the target IoT domain, which means it is more likely to come from the private-label-space distribution of the source Internet domain; it therefore receives a smaller weight ω(z^s). For example, samples belonging to the Web Attack and Brute Force classes in our case are assigned smaller weights to restrain their effect on knowledge transfer in PDA. In contrast, a small P_s(z^s)/P_t(z^s) means that a sample is more likely to come from the shared label spaces, such as the DDoS, Botnet, and Scan classes in our case. These samples are exactly the ones we need, because they contribute positively to domain adaptation; hence, they are assigned higher weights.
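As a numerical illustration of Eq. (4) (values chosen by us): if $P_s(z^s)/P_t(z^s) = 4$, i.e., the feature is four times more likely under the source distribution, then $\omega(z^s) = 1/(4+1) = 0.2$; if the ratio is $0.25$, then $\omega(z^s) = 1/(0.25+1) = 0.8$. Source samples that resemble the target IoT traffic therefore dominate the weighted source term in the adversarial objective below.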
To distinguish the weighted source features z^s from the target features z^t and to optimize Ft, D0 is introduced as the second domain classifier. Ft and D0 play a two-player game to align the shared label spaces, that is, to reduce the shift on the shared label spaces. Once the importance weights are obtained, the objective function can be written as follows, where maximizing the loss with respect to the parameters of D0 attempts to identify the difference between the source and target sample distributions:

\min_{F_t} \max_{D_0} L_{\omega}(D_0, F_s, F_t) = \lambda\, \mathbb{E}_{x \sim P_s(x)}\big[\omega(z^s) \log D_0(F_s(x))\big] + \mathbb{E}_{x \sim P_t(x)}\big[\log(1 - D_0(F_t(x)))\big]    (5)
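A compact PyTorch sketch of one weighted adversarial update in the spirit of Eqs. (3)–(5) is given below; Fs, Ft, D, and D0 are modules as sketched earlier, Fs is kept frozen, and the optimizers, the weight normalization, and the trade-off λ are our assumptions:

import torch

def pda_step(Fs, Ft, D, D0, x_src, x_tgt, opt_D, opt_D0, opt_Ft, lam=1.0, eps=1e-8):
    """One illustrative PDA update; Fs stays fixed in this phase."""
    with torch.no_grad():
        z_s = Fs(x_src)                      # source features
    z_t = Ft(x_tgt)                          # target features

    # Eq. (3): train D to tell source features from target features.
    opt_D.zero_grad()
    loss_D = -(torch.log(D(z_s) + eps).mean()
               + torch.log(1 - D(z_t.detach()) + eps).mean())
    loss_D.backward()
    opt_D.step()

    # Eq. (4): importance weights; private-label-space source samples get small weights.
    with torch.no_grad():
        w = 1 - D(z_s)
        w = w / (w.mean() + eps)             # normalize weights within the batch (our assumption)

    # Eq. (5), max over D0: separate weighted source features from target features.
    opt_D0.zero_grad()
    loss_D0 = -(lam * (w * torch.log(D0(z_s) + eps)).mean()
                + torch.log(1 - D0(z_t.detach()) + eps).mean())
    loss_D0.backward()
    opt_D0.step()

    # Eq. (5), min over Ft: move target features so that D0 cannot tell them apart.
    opt_Ft.zero_grad()
    loss_Ft = torch.log(1 - D0(z_t) + eps).mean()
    loss_Ft.backward()
    opt_Ft.step()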