New Attacks on Dataset, Model and Input. a Threat Model for AI

The document presents a threat model for AI-based software, emphasizing the unique vulnerabilities introduced by machine learning (ML) and artificial intelligence (AI) techniques. It includes a systematic approach to identify threats through a software development process model and an attack taxonomy derived from adversarial AI research. The authors apply this model to real-world AI software, highlighting the importance of understanding and mitigating risks associated with AI technology.

Uploaded by

emmanuel.oguchi2024

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views

New Attacks on Dataset, Model and Input. a Threat Model for AI

Uploaded by

emmanuel.oguchi2024

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

ADMIn: Attacks on Dataset, Model and Input.

A Threat Model for AI

Based Software

Vimal Kumar, Juliette Mayo and Khadija Bahiss

School of Computing and Mathematical Sciences, University of Waikato, Hamilton, New Zealand
[email protected], [email protected], [email protected]

Keywords: Threat Modelling, Artificial Intelligence, Machine Learning, Taxonomy

Abstract: Machine learning (ML) and artificial intelligence (AI) techniques have now become commonplace in software
arXiv:2401.07960v1 [cs.CR] 15 Jan 2024

products and services. When threat modelling a system, it is therefore important that we consider threats
unique to ML and AI techniques, in addition to threats to our software. In this paper, we present a threat
model that can be used to systematically uncover threats to AI based software. The threat model consists of
two main parts, a model of the software development process for AI based software and an attack taxonomy
that has been developed using attacks found in adversarial AI research. We apply the threat model to two real
life AI based software and discuss the process and the threats found.

1 INTRODUCTION agement professionals and CISOs also need to be

aware of AI risks when they are assessing Informa-
While research in Machine Learning (ML) has been tion Security risks to their organisations. A cru-
actively pursued for several decades, over the last cial aspect of risk management is threat enumera-
decade or so software products and services that use tion/identification. To assess AI risks, it is imperative
Machine Learning (ML) and Artificial Intelligence to identify the threats to the AI based software prod-
(AI) have gained tremendous visibility. ML and AI ucts and services being used in an organisation.
based software products and services have become In spite of a large body of literature on adversarial
ubiquitous and seen in fields as diverse as healthcare, AI and ML, there is a lack of methods or methodolo-
finance, automotive, manufacturing, etc. ML and AI gies that practitioners can use to systematically iden-
are also extensively being used in cybersecurity for tify threats to AI based software being used in their
various tasks such as endpoint protection, malware organisation. Consequently they rely on inconsistent
detection, spam filtering, intrusion detection, authen- threat identification methods, based either on vendor
tication and fingerprinting etc. In the rest of this pa- supplied information or on random threat enumera-
per, we call software products and services that use tion. In this paper we present a methodology to map
ML and AI algorithms, AI based software. The use the existing adversarial AI threats to an AI software
of AI provides advantages of efficiency, functionality development process, thereby creating a threat model
and innovation, but it also has the potential to intro- that can be used by anyone to detect threats in their
duce vulnerabilities that are unique to AI based soft- AI software.
ware. The awareness of the risks posed by such vul- The rest of this paper is organised as following.
nerabilities has been steadily growing exemplified by We first discuss work related to threat modelling AI
the recently held AI safety summit and the Bletch- based software in section 2. In section 3 we present
ley declaration 1 at its conclusion as well as president our software development process diagram for AI
Biden’s executive order on AI2 . based software, while section 4 presents our attack
Practitioners such as security managers, risk man- taxonomy. In section 6 we show how the taxonomy is
mapped to the software development process. In sec-
1 https://www.gov.uk/government/publications/ai-
tion 7 we discuss the case studies of employing our
safety-summit-2023-the-bletchley-declaration/the- method on real world software before concluding in
bletchley-declaration-by-countries-attending-the-ai-safety-
summit-1-2-november-2023
section 8.
2 Executive order 14110: Safe, secure, and trustworthy
development and use of artificial intelligence.
2 RELATED WORK to create guidance on security development cycle for
AI based software (Marshall et al., ). This approach
The purpose of threat modelling is to identify, cat- is similar to the STRIDE-AI approach and uses the
egorise, and analyse potential vulnerabilities in ap- STRIDE categorisation of threats. Finally, OWASP
plications (product, software, data, system, service, has released the Machine Learning Security Top 10
etc. ). Threat modelling is done by analysing one or for 2023 (Singh et al., 2023), which lists the top at-
more of the attacker’s motivation, goals, techniques, tacks on AI based software. There is some overlap
and procedures in the context of the application that in the threats covered in OWASP’s list and our taxon-
is being threat modelled. It usually consists of two omy which suggests that the list could be integrated
main parts; modelling and enumeration. In mod- with our threat model in future, however that integra-
elling, the entity conducting the exercise creates a tion is outside the scope of this paper. In addition to
model of the system at hand. Common approaches the list, OWASP documentation provides guidelines
to do this include asset- and impact-centric, attack(er) on mitigation as well as metrics on risk factors.
– and threat-centric, and software- and system-centric We proffer that an attack-centric approach (as op-
approaches. (Martins et al., 2015; Selin, 2019). In posed to an asset-centric approach) to threat mod-
enumeration, this model is used to identify threats to elling provides a more straightforward way to relate
the system being studied usually aided by a taxonomy. the current adversarial AI research to software devel-
There has been substantial amount of work in ad- opment, as a taxonomy can be developed from the ex-
versarial machine learning recently, spurred on by isting literature. In the rest of this paper we present
the fundamental question on the security of machine this attack-centric approach.
learning first asked in (Barreno et al., 2006) and (Bar-
reno et al., 2010). The works described in (Worzyk
et al., 2019) (Wang et al., 2020) (Hu et al., 2022) 3 MODELLING
(Moon et al., 2022) explore techniques that attack
specific AI models and their defences, while works The first step in threat modelling is to create an ab-
such as (Wang et al., 2022) (Greshake et al., 2023) stract model of the system under investigation. Exist-
survey and summarise these techniques. (Mirsky ing threat models have used a number of approaches
et al., 2023) on the other hand conducted a large- for this. STRIDE for example uses data flow dia-
scale survey and ranking of threats to AI. Not enough grams (DFDs) to create a model and augments those
attention, however, has been paid to the develop- diagrams with trust boundaries.
ment of techniques that can be used to systemati- We chose to use the software development
cally identify threats in AI based software. One of process of AI based software for modelling. The
the first attempts to do so was from MITRE who de- software development process for AI based software
veloped MITRE ATLAS (Adversarial Threat Land- consists of three distinct phases, data processing,
scape for Artificial-Intelligence Systems) (MITRE, ). model development and deployment, as shown in
ATLAS is a knowledge-base that classifies adversary Figure 1. In the diagram presented in this paper we
tactics and techniques according to the well-known have tried to strike a balance between readability for
ATT&CK framework. The European cybersecurity cybersecurity practitioners, and the detail presented
agency ENISA also has overarching guidelines on the in the diagram. In the rest of this section we go over
risks to AI without specifying the methodology to each of the phases one by one.
identify those risks (Caroline et al., 2021). The work
most closely related to ours is STRIDE-AI (Mauri
and Damiani, 2022), which is an asset-centric ap-
proach for threat modelling AI based software. It 3.1 Data Processing
relies on conducting an FMEA (Failure Modes and
Effects Analysis) on each individual information as- The main objective of this phase is to transform or
set used in the AI based software development pro- process datasets into a format that can be used to train
cess. The information gained from the FMEA analy- a model. We divide the work undertaken in this phase
sis is then mapped to STRIDE to enumerate possible into three processes, Requirement Engineering, Data
threats. It can be argued that an asset-centric approach Cleaning and Feature Engineering & Labelling.
while useful for the developors and vendors may not Requirement Engineering: The process of re-
be the best approach for the organisations that are quirement engineering involves developers determin-
the consumers of AI based software. Microsoft has ing the specifications of the client’s AI based soft-
also utilised the FMEA approach along with bug bar ware and the requirements of needed datasets. Ac-
cording to (ur Rehman et al., 2013) there are three
Manually created,
DATA PROCESSING PHASE automated or
Raw Dataset
Specifications crowdsourced features
of Systems
and Models
& Cleaned Refined
Training 3 Training
System and Domain 1 Agreed 2
Information, Requirements Dataset Feature Dataset
Start Requirement Data Engineering
Regulations, Preparation
Stakeholder and Engineering and Labelling
Organisational
Requirements, etc.

MODEL DEVELOPMENT PHASE

New
6 Hyperparameters
Tuning Method
Optimized Hyperparameter
Trained Model Tuning and
Algorithm
Selection Validation
Dataset
7
No
Model No
Testing Dataset
Evaluation
After
*
Development 5 4 Machine
Yes Trained Model
Learning Algorithm
Is the Model Model Model with
Adequate? Evaluation Training
Model Yes Model Hyperparameters
During
Evaluation Evaluation Process
Development
Outcome After Outcome
Development During
Development
*

Data Input
DEPLOYMENT PHASE

Classification , Model
Classification
Yes 8 9 Prediction or 10 Evaluation
or Prediction
Is the Model Decision Outcome
* Adequate? Software Decision-
Model During
Evaluation Deployment
Deployment Making
No During
Procedure
Deployment
Text
*
Yes

Deployment Deployment
Environment Environment

Figure 1: Software development process for AI based software. Circles represent processes, arrows represent inputs and
outputs, diamonds represent decisions and ‘*’ means that the arrow can point to any previous process

inputs into the requirement engineering process: sys- errors, inconsistencies, omissions and duplications.
tem/domain information, stakeholder/organisational Data cleaning is a fundamental part of this process,
requirements, and regulations. Such requirements are and is often used in combination with data collection
often gathered via several different methods, such and curation (Symeonidis et al., 2022). Data prepara-
as traditional (e.g., interviews), modern (e.g., pro- tion sometimes involves other techniques such as data
totyping), cognitive (e.g., card sorting), group (e.g., transformation and is a vital step in the data process-
brainstorming) or contextual (e.g., ethnography) cat- ing phase (Kreuzberger et al., 2023).
egories. In general, the outputs of the requirement Feature Engineering & Labelling: Features are
engineering process are the agreed requirements and elements used to describe specific objects in data. The
the specifications of the systems and model being de- process of feature engineering involves creating fea-
veloped. Other, more specific details may be included tures for a dataset so it can be understood and used by
as well, such as the plans for acquiring the data, the an algorithm (Dong and Liu, 2018). The Feature En-
amount of data needed and how accurate the model gineering & Labelling process in our diagram may ad-
needs to be. ditionally encompass related techniques of feature ex-
Data Preparation: Once the specification of the traction, feature construction, feature storage and fea-
software and the requirements of the needed datasets ture encoding. There may be algorithms that do not
have been identified, work on collecting and cleaning have a feature engineering part to them. Our model,
data is usually started. (Roh et al., 2019) have di- however, is created to be exhaustive so that it covers
vided the various methods of raw data collection into most possibilities. As will be shown later, when this
three categories, discovery, augmentation and gener- diagram is used, processes that aren’t applicable to a
ation. The raw dataset thus collected can be in var- given scenario, can be removed.
ious forms, such as, audio, video, text, time-series Labelling is a related idea, often used in super-
or a combination of such formats. It may also have vised or semi-supervised learning and involves as-
errors, inconsistencies, omissions and duplications. signing a category or tag to a given piece of data
Data cleaning involves finding and resolving these (Grimmeisen et al., 2022), to help the model learn or
distinguish between objects. Model Evaluation after Development: At this
stage the model is evaluated once again. This process
3.2 Model Development takes two inputs, the optimised trained model pro-
duced after tuning and a testing dataset. The testing
The main objective of this phase is to train a model dataset is used to assess the performance of the final
and evaluate its performance. We divide the work optimised trained model. If the outcome of the eval-
undertaken in this phase into four processes, Model uation is adequate, the deployment phase is executed.
Training, Model Evaluation during Development, Hy- Otherwise, depending on the situation, the model may
perparameter Tuning and Model Evaluation after De- need to be retrained from the very beginning, or use
velopment. different training data, features, or labels.
Model Training: The refined training dataset, and
features or labels produced from the preceding pro- 3.3 Deployment
cess are used as inputs to the Model Training pro-
cess where an algorithm is trained on the data pro- In this phase the model is deployed as part of a
vided. Another input to this process is an algorithm software product or service in environments such as
or model that is to be trained. Depending on the spe- cloud, server, desktop or mobile device. The work un-
cific details of the AI model, the used algorithms will dertaken in this phase is divided into three processes,
differ. Examples of some algorithms that can be used Software Deployment, Decision Making and Model
include neural networks, ensemble learning and other Evaluation during Deployment.
supervised, semi-supervised or unsupervised learning Software Deployment: This process involves the
methods. Model training is the most critical process fully developed AI based software being deployed in
in the development of AI based software and outputs different environments. The input into this phase is
a trained model to make classifications or predictions. the data the software uses. This data is used by the
Model Evaluation during Development: In this software to output a classification or prediction, de-
process, the trained model from the preceding process pending on the problem that is being solved.
is used as an input along with a validation dataset. Decision Making: While in some cases the clas-
The validation dataset is used on the trained model sification or prediction may be the desired end-goal,
to measure model performance. This dataset can be in other cases the classification or prediction output
generated via several different methods. One method may be fed into a process, which produces a decision
is to split the training dataset into three subsets: the based on the input.
training dataset, validation dataset and testing dataset. Model Evaluation during Deployment: To en-
Other methods include k-fold cross validation, which sure that the model does not drift overtime and is
involves splitting the dataset into ‘k’ subsets. In some fit for purpose, constant, iterative evaluation or mon-
cases, multiple methods may be used. itoring of a model during deployment is sometimes
Hyperparameter Tuning: If the outcome of necessary. The Model Evaluation during Deployment
model evaluation during development is not adequate process encapsulates this thinking. If the evaluation
or the developers want to improve model perfor- outcome is adequate, the deployment phase is contin-
mance, the process of hyperparameter tuning may ued. If the evaluation outcome is not adequate, the
occur. Some examples of hyperparameters that are model may be retrained from the start, or use differ-
tuned are, the learning rate, neural network architec- ent training data, features, or labels. This evaluation
ture or the size of the neural network used (Feurer and is usually done periodically and not necessarily after
Hutter, 2019). Alternatively, developers may also go each run during deployment.
back to the data cleaning or the feature engineering &
labelling process or change the algorithm used to cre-
ate the model. In Figure 1 this is shown by a ‘*’. This 4 AI THREAT ENUMERATION
process occurs iteratively until the model is deemed
satisfactory by the developers. The second part of threat modelling is threat enumer-
Various different types of tuning methods ex-
ation. To understand the threats to AI, we explored
ist, each with their own advantages and disadvan-
extensive research literature in adversarial AI. Our lit-
tages. Examples include random search, grid search,
erature review has led to the creation of a taxonomy
or Bayesian optimisation. Meta-heuristic algorithms
of threats to AI shown in Figure 2. In our taxonomy,
such as particle swarm optimisation and genetic algo-
all possible threats to AI based software are divided
rithms are other popular tuning methods used as well
into three main categories, attacks on data, attacks on
(Yang and Shami, 2020).
model and attacks on inputs, from which we derive
our acronym ADMIn. model poisoning or logic corruption attacks, policy
exfiltration attacks and model extraction attacks.
4.1 Attacks on Data Model Poisoning or Logic Corruption Attacks:
In these attacks the adversary attempts to maliciously
In these type of attacks, the adversary’s focus is on modify or alter the logic, algorithm, code, gradients,
data. The adversary either attempts to steal propri- rules or procedures of the software. This can result
etary data through the algorithm or tries to poison or in reduction of performance and accuracy, as well as
maliciously modify internal data and/or systems. This causing the model to carry out malicious actions (Os-
category is further split into two types of attacks, data eni et al., 2021)(Benmalek et al., 2022)(Wang et al.,
exfiltration attacks and data poisoning attacks. 2022). Such attacks can be hard to defend against but
Data Exfiltration Attacks: In these attacks, the usually require that the adversary has full access to
adversary attempts to steal private information from the algorithms used in the model. This makes these
the target model’s dataset. This can take place in three attacks less likely to occur.
different ways. First, through property exfiltration at- Policy Exfiltration Attacks: In policy exfiltration
tacks, where the attack consists of an adversary steal- attacks the adversary attempts to learn the policy that
ing data properties from the training dataset. Sec- the model is enforcing by repeatedly querying it. The
ond, through dataset theft attacks, where the attacks repeated querying may make evident the input/output
involve the theft of the entire dataset. Finally, exfil- relationship and may result into the adversary learn-
tration can be achieved through datapoint verification ing the policy or rules being implemented.
attacks. In these attacks, an adversary attempts to de- Model Extraction Attacks: Also known as
termine if a specific datapoint is in the model’s train- model stealing, the adversary in these types of attacks
ing dataset via interactions with the model. steals the model to reconstruct or reverse engineer it
Data Poisoning Attacks: In these attacks, the ad- (Hu et al., 2022).This is usually done by decipher-
versary deliberately attempts to corrupt the datasets ing information such as parameters or hyperparame-
used by the AI based software. The adversary may ters. These attacks require the inputs to the model be
poison the dataset via adding new data, modifying ex- known to the adversary whereby unknown parameters
isting data (e.g., content, labels, or features), or delet- can be computed using information from a model’s
ing data in the model’s training or validation dataset, inputs and its outputs (Chakraborty et al., 2021).
with the aim of diminishing the model’s performance.
An attack consisting of addition of new datapoints 4.3 Attacks on Inputs
into the training data is performed with the intention
of adding biases to the model, so it mis-classifies in- In these type of attacks, the adversary uses malicious
puts. (Oseni et al., 2021)(Liu et al., 2022). Poisoning content as the input into a ML model during deploy-
of datasets may take place through the environment ment. This category is further split into four types of
or through the inputs to the model. Such attacks may attacks, prompt injection attacks, denial of service at-
either be targeted or untargeted. In a targeted attack tacks, evasion attacks and man-in-the-middle attacks.
an adversary may attempt to, for example have a mal- Prompt Injection Attacks: Prompt injection at-
ware classification model mis-classify the malware as tacks are a relatively new but well-known type of at-
benign. In an untargeted attack, the adversary on the tack. It consists of an adversary trying to manipulate
other hand is looking to make the model mis-classify a (natural language processing) system via prompts to
the malware as anything but the actual classification. gain unauthorized privileges, such as bypassing con-
Attacks where the adversary is looking to modify or tent filters (Greshake et al., 2023). The ChatGPT ser-
delete existing data, can be comparatively harder to vice for example responds to text prompts and may
mount as such attacks require the knowledge of and contain text filters for commercial sensitivity, privacy
access to the training data. Such access and knowl- and other reasons. However, crafting prompts in cer-
edge however can be gained by exploiting software tain ways may allow users to bypass these filters in
vulnerabilities in the systems surrounding the dataset. what is known as a prompt injection attack. Prompt
injection attacks can be harder to defend against com-
4.2 Attacks on Model pared to other well known injection attacks such as
SQL or command injection because the data input as
In these type of attacks, the adversary’s focus is on the well as the control input, both consist of natural lan-
model being used. The adversary either attempts to guage in textual prompts.
steal the proprietary model or tries to modify it. This Denial of Service (DoS) Attacks: A DoS attack
category is further split into three types of attacks, consists of an adversary disrupting the availability of
Figure 2: Taxonomy of threats to AI

a model by flooding its inputs with illegitimate re- Evasion

Attacks and
quests. DoS attacks are widely understood and in gen- Their
Subcategories
eral in such attacks the adversary can flood the model Data Poisoning
Model Poisoning or
Attacks and
with as many inputs as possible to make the soft- Their Subcategories Logic Corruption
Attacks and Their
ware unavailable to others, however an AI based soft- SPOOFING
Subcategories

ware can be susceptible to another kind of DoS attack

Man-in-the-
where the model is flooded with deliberately manipu- middle Attacks
TAMPERING
Data Exfiltration
lated inputs to cause purposeful mis-classifications or Attacks and
REPUDIATION Their Subcategories
errors (Oseni et al., 2021).
Evasion Attacks: In these attacks the adversary INFORMATION
DISCLOSURE
aims to avoid accurate classification by a model. For
Model
example, an adversary may craft spam emails in a cer- DENIAL OF
SERVICE
Extraction
Policy Attacks
tain way to avoid being detected by AI spam filters. Exfiltration
Attacks ELEVATION OF
Evasion attacks can be undertaken via methods such PRIVILEGE

as changing the model’s policy. The techniques used

Denial of
in evasion attacks are usually specific to the types of Service
Prompt
Attacks
inputs the model accepts. We have therefore sub- Injection
Attacks
classified these attacks based on the inputs; Natural
Language based attacks, image and video based at-
Attacks on the Dataset Attacks on the Inputs
tacks and attacks that modify data in the real world.
STRIDE Classification
A threat modeller can check if any category applies Attacks on the Model

to their specific AI based software by looking at the

specific type of input the model expects. Figure 3: Threats to AI classified according to STRIDE
Man in the Middle Attacks: In these attacks, the
input or output of a deployed model is intercepted
5 THREATS TO AI CLASSIFIED
and/or altered maliciously by an adversary (Moon ACCORDING TO STRIDE
et al., 2022)(Wang et al., 2020). This is usually more
likely where an AI based software is being used as a In some cases, one may want to relate each attack
service but is also possible when the software is inter- or threat to STRIDE if additional information is
nally deployed as a product. As an example, an adver- required. STRIDE is a common threat model used
sary can intercept and alter system and network data for software threats and as mentioned in section 2,
being input to a model to produce mis-classification threat models from Microsoft and STRIDE-AI use
that ultimately results in defence evasion. STRIDE. In this section we explain how the threats
in our taxonomy are related to STRIDE. This is
pictorially shown in Figure 3.
Spoofing • Add any additional inputs, outputs, or processes
• In evasion attacks, the adversary modifies data in that are/were used, but are not displayed in the
the real world or uses other tactics such as modify- original diagram
ing images to avoid detection from a model. This • Sequentially for each process and its inputs and
is a classic spoofing threat. outputs, add the attacks that apply to your soft-
• In model poisoning, or data poisoning attacks, an ware by referring to the threats in the attack tax-
adversary may alter code or inputs to a model onomy shown in 2
to avoid detection by creating purposeful mis- • If additional information is required, relate the ad-
classifications. versarial attacks for each input, process, and out-
Tampering put on the ML diagram to STRIDE
• In model poisoning, or data poisoning attacks, the
main goal of an adversary is to maliciously mod-
ify a model’s code or data. This is clearly a tam- 7 CASE STUDIES
pering threat.
We used our threat modelling approach on two real
• In man-in-the-middle attacks, an adversary may world AI based software and went through the process
tamper with data that is being sent to and from the with their developers to evaluate the effectiveness of
model. our approach. In this section, we go over the results
Repudiation of the threat modelling exercises.
• As evasion attacks are considered spoofing at-
tacks, repudiation exists. This is because an ad- 7.1 Case Study 1
versary can deny carrying out any malicious ac-
tions if robust audit logs are missing. A threat modelling case study was undertaken with
Information Disclosure the app and website, ‘Aotearoa Species Classifier’ 4 .
This software identifies animal and plant species from
• In data, model or policy exfiltration attacks, the all around New Zealand from a single photo. We
main goal of the adversary is to steal the model, went through the 5 step threat modelling process as
it’s properties or dataset, which discloses confi- described in section 6 with the developers.
dential information. Step 1.We used the software development process
Denial of Service diagram from Figure 1.
Step 2.The application used a convolutional neu-
• In DoS attacks, the adversary’s main goal is to
ral network so the ‘Feature Engineering & Labelling’
deny the software’s service to users.
process was removed from the diagram. The ‘Model
Elevation of Privilege Evaluation During Deployment’ process was also re-
• In prompt injection attacks, the adversary’s main moved as it was not used.
goal is to gain unauthorized privileges or access Step 3. No additional inputs, outputs, processes,
via certain input prompts. or arrows were added to the diagram.
Step 4.The attack taxonomy was applied to the
software development process one by one. While the
exercise took place by mapping the taxonomy to each
6 THREAT MODELLING process, for brevity we describe the results in Table 1
PROCESS in terms of attacks on dataset, model, and inputs.
Step 5.This step was not undertaken as additional
AI based software can be threat modelled by map- STRIDE information was not required.
ping the threats in the previously explained taxonomy
to the processes in the software development process 7.2 Case Study 2
discussed in section 3, as described below.
A threat modelling case study was undertaken with
• Obtain the software development process diagram
an image-based AI software. This software is used
for AI based software as shown in Figure 1
to improve roads and automatically detect roading
• Remove any inputs, outputs, or processes that
are/were not used in the software. 4 https://play.google.com/store/apps/details?id=com.
waikatolink.wit app
Table 1: Comparing the possible threats to the AI based software in the case studies
Attack on Case study 1 Case Study 2
Dataset The application was not susceptible to any of The training data for the application was pri-
the exfiltration attacks. The software and all its vate and stored on cloud. It was discovered
data was open source, aspects such as the algo- that the data was susceptible to exfiltration via
rithm, training data or hyperparameters cannot dataset theft. The data was collected from cus-
be stolen as these were already publicly avail- tomers but it couldn’t be said if the data was
able. completely trustworthy. Therefore data poi-
It was however possible for the dataset to be soning was a valid threat.
poisoned. Alhough the dataset is derived from
iNaturalist3 , which is generally considered a
trusted source, it is possible that an adversary
can compromise the data repositories to poison
the data.
Model It was possible for an adversary to poison the It was possible for an adversary to poison the
model if the adversary had access to some model if the adversary has access to the some
parts of the AI based software development parts of the AI based software development
process. Policy Exfiltration and Model Extrac- process. Policy Exfiltration and Model Extrac-
tion attacks were however not relevant due to tion attacks were however not relevant due to
the software being open source. the software being open source.
Input As the inputs into this model were images, at- As the software is only available to known
tacks that involve modifying the model’s en- clients, DoS attacks were not considered to be
vironment and image-based evasion attacks a threat. Although man-in-the-middle attacks
could occur. The web-based version of the ap- were a possibility, the developers trusted the
plication was found to be vulnerable to DoS cloud provider’s security enough to not con-
attacks and man-in-the-middle attacks on both sider it a relevant threat. Prompt injection at-
the inputs and outputs. The application did not tacks were not relevant as the model only used
take a prompt as input, therefore, it was not images. However, image-based evasion at-
susceptible to prompt injection attacks tacks were considered relevant.

problems such as potholes. We went through the 5 8 CONCLUSION

step threat modelling process as described in section
6 with the developers. A large number of software products and services
Step 1.We used the software development process these days claim to utilize AI, and cybersecurity prac-
diagram from Figure 1. titioners are expected to manage cybersecurity risks
Step 2.Since the application development process posed by such software. In this paper we have pre-
included data labelling but not feature engineering, sented a systematic approach to identify the threats to
inputs related to feature engineering were omitted AI based software. Our threat model entitled ‘AD-
from the diagram. The process of ‘Model Evaluation MIn’ is an attack-centric model, that categorises ad-
during Deployment’ was removed as it was not used. versarial AI attacks into three categories. These at-
The yes arrow pointing to ‘*’ at the ‘Is the Model Ad- tacks are mapped to the software development process
equate?’ decision was removed, as it was never im- for AI based software, to ascertain the threats that are
plemented during the software deployment process. applicable to the software under investigation.
Step 3.No additional inputs, outputs, processes, or Both AI and Cybersecurity are fields that are see-
arrows were added to the diagram. ing rapid development and there is increasing aware-
Step 4.The attack taxonomy was applied to the ness of a need for threat modelling AI based software.
software development process one by one. While the In future, we would investigate the integration of the
exercise took place by mapping the taxonomy to each ADMIn threat model with OWASP ML Top 10 and
process, for brevity we describe the results in Table 1 MITRE ATLAS. We would also like to build upon
in terms of attacks on dataset, model, and inputs. ADMIn to create a Risk Assessment and Management
Step 5.This step was not undertaken as additional methodology for AI based software.
STRIDE information was not required.
ACKNOWLEDGEMENTS Martins, G., Bhatia, S., Koutsoukos, X., Stouffer, K., Tang,
C., and Candell, R. (2015). Towards a systematic
The authors would like to acknowledge funding from threat modeling approach for cyber-physical systems.
In 2015 Resilience Week (RWS), pages 1–6. IEEE.
the New Zealand Ministry of Business, Innovation
and Employment (MBIE) for project UOWX1911, Mauri, L. and Damiani, E. (2022). Modeling threats to ai-
ml systems using stride. Sensors, 22(17).
Artificial Intelligence for Human-Centric Security.
Mirsky, Y., Demontis, A., Kotak, J., Shankar, R., Gelei, D.,
Yang, L., Zhang, X., Pintor, M., Lee, W., Elovici, Y.,
and Biggio, B. (2023). The threat of offensive ai to
REFERENCES organizations. Computers & Security, 124:103006.
MITRE. Mitre atlas, adversarial threat landscape for
Barreno, M., Nelson, B., Joseph, A. D., and Tygar, J. D. artificial-intelligence systems. https://atlas.mitre.org.
(2010). The security of machine learning. Machine Accessed: 2023-10-26.
Learning, 81:121–148. Moon, S., An, G., and Song, H. O. (2022). Preemptive im-
Barreno, M., Nelson, B., Sears, R., Joseph, A. D., and Ty- age robustification for protecting users against man-
gar, J. D. (2006). Can machine learning be secure? In in-the-middle adversarial attacks. In Proceedings of
Proceedings of the 2006 ACM Symposium on Informa- the AAAI Conference on Artificial Intelligence, vol-
tion, computer and communications security, pages ume 36, pages 7823–7830.
16–25. Oseni, A., Moustafa, N., Janicke, H., Liu, P., Tari, Z., and
Benmalek, M., Benrekia, M. A., and Challal, Y. (2022). Vasilakos, A. (2021). Security and privacy for artifi-
Security of federated learning: Attacks, defensive cial intelligence: Opportunities and challenges. arXiv
mechanisms, and challenges. Revue des Sciences preprint arXiv:2102.04661.
et Technologies de l’Information-Série RIA: Revue Roh, Y., Heo, G., and Whang, S. E. (2019). A survey on
d’Intelligence Artificielle, 36(1):49–59. data collection for machine learning: a big data-ai in-
Caroline, B., Christian, B., Stephan, B., Luis, B., Giuseppe, tegration perspective. IEEE Transactions on Knowl-
D., Damiani, E., Sven, H., Caroline, L., Jochen, M., edge and Data Engineering, 33(4):1328–1347.
Nguyen, D. C., et al. (2021). Securing machine learn- Selin, J. (2019). Evaluation of threat modeling methodolo-
ing algorithms. gies.
Chakraborty, A., Alam, M., Dey, V., Chattopadhyay, A., Singh, S., Bhure, S., and van der Veer, R. (2023). Owasp
and Mukhopadhyay, D. (2021). A survey on adversar- machine learning security top 10 - draft release v0.3.
ial attacks and defences. CAAI Transactions on Intel- Symeonidis, G., Nerantzis, E., Kazakis, A., and Papakostas,
ligence Technology, 6(1):25–45. G. A. (2022). Mlops-definitions, tools and challenges.
Dong, G. and Liu, H. (2018). Feature engineering for ma- In 2022 IEEE 12th Annual Computing and Commu-
chine learning and data analytics. CRC press. nication Workshop and Conference (CCWC), pages
Feurer, M. and Hutter, F. (2019). Hyperparameter optimiza- 0453–0460. IEEE.
tion. Automated machine learning: Methods, systems, ur Rehman, T., Khan, M. N. A., and Riaz, N.
challenges, pages 3–33. (2013). Analysis of requirement engineering pro-
Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, cesses, tools/techniques and methodologies. Interna-
T., and Fritz, M. (2023). More than you’ve asked for: tional Journal of Information Technology and Com-
A comprehensive analysis of novel prompt injection puter Science (IJITCS), 5(3):40.
threats to application-integrated large language mod- Wang, D., Li, C., Wen, S., Nepal, S., and Xiang, Y. (2020).
els. arXiv preprint arXiv:2302.12173. Man-in-the-middle attacks against machine learning
Grimmeisen, B., Chegini, M., and Theissler, A. (2022). Vis- classifiers via malicious generative models. IEEE
gil: machine learning-based visual guidance for inter- Transactions on Dependable and Secure Computing,
active labeling. The Visual Computer, pages 1–23. 18(5):2074–2087.
Hu, H., Salcic, Z., Sun, L., Dobbie, G., Yu, P. S., and Zhang, Wang, Z., Ma, J., Wang, X., Hu, J., Qin, Z., and Ren, K.
X. (2022). Membership inference attacks on machine (2022). Threats to training: A survey of poisoning at-
learning: A survey. ACM Computing Surveys (CSUR), tacks and defenses on machine learning systems. ACM
54(11s):1–37. Computing Surveys, 55(7):1–36.
Kreuzberger, D., Kühl, N., and Hirschl, S. (2023). Ma- Worzyk, N., Kahlen, H., and Kramer, O. (2019). Phys-
chine learning operations (mlops): Overview, defini- ical adversarial attacks by projecting perturbations.
tion, and architecture. IEEE Access. In 28th International Conference on Artificial Neu-
ral Networks, Munich, Germany, pages 649–659.
Liu, P., Xu, X., and Wang, W. (2022). Threats, attacks and
Springer.
defenses to federated learning: issues, taxonomy and
perspectives. Cybersecurity, 5(1):1–19. Yang, L. and Shami, A. (2020). On hyperparameter opti-
mization of machine learning algorithms: Theory and
Marshall, A., Parikh, J., Kiciman, E., and Kumar, R. S. S.
practice. Neurocomputing, 415:295–316.
Ai/ml pivots to the security development lifecycle
bug bar. https://learn.microsoft.com/en-us/security/
engineering/bug-bar-aiml. Accessed: 2023-10-26.