Safe AI Systems
Requirements in Engineering AI-Enabled Systems: Open Problems and Opportunities
Lionel Briand, FACM, FIEEE, FRSC, MAE
http://www.lbriand.info

Affiliations & Expertise
• Canada Research Chair (Tier 1), University of Ottawa, Canada
• Director, Lero, Research Ireland centre for software research
• Software Engineering (SE)
• AI4SE, e.g., test automation (safety, security), requirements QA
• SE4AI, e.g., assurance of AI-enabled systems

What’s Different with AI Software?
• No source code
• No specifications
• Behaviour acquired through training based on data
• Models are never perfectly accurate (uncertainty)
• This has significant impact

New Types of Requirements
• Not really new, but more important in AI systems
• Fairness and discrimination: systemic, statistical …
• Uncertainty: model, system
• Explainability: what decisions or predictions does one need to explain and for which purpose?
• Data quality: compliance with ODD (Operational Design Domain), correctness of labels or feature values, …
• New security vulnerabilities, safety hazards

Uncertainty and Explainability Example

Key-points Detection Testing with Simulation in the Loop
• DNNs used for key-points detection in images
• Testing: Find a test suite that causes the DNN to poorly predict as many key-points as possible within the time budget
• Evaluate safety from testing results
• Images generated by a simulator
[Figure: ground truth vs. predicted key-points (Ul Haq et al., 2021)]

Example Application
• Drowsiness or gaze detection based on an interior camera monitoring the driver
• In the drowsiness or gaze detection problem, each Key-Point (KP) may be highly important for safety
• Each KP leads to one test objective
• For our subject DNN, we have 27 test objectives
• Goal: Cause the DNN to mispredict as many key-points as possible
• Solution: Many-objective search algorithms (based on genetic algorithms) combined with a simulator

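Overview
[Figure: test generation loop. The input generator (search) sends an input vector to the simulator, which produces a test image and the actual key-point positions. The DNN takes the test image and returns predicted key-point positions. The fitness calculator compares actual and predicted positions and returns a fitness score (error value) to the search, which outputs the most critical test inputs.]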
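The loop in this overview can be sketched in a few lines of Python. This is only an illustration under strong assumptions: `simulate` and `dnn_predict` are hypothetical stubs standing in for the real simulator and DNN, the input vector is reduced to three head-pose angles, and the search is a simple archive-based mutation loop, not the many-objective algorithm used by Ul Haq et al.

```python
import math
import random

NUM_KEYPOINTS = 27  # one test objective per key-point, as stated in the slides


def simulate(params):
    """Hypothetical simulator stub: head pose -> (test image, actual key-points)."""
    pitch, roll, yaw = params
    actual = [(i + 0.1 * yaw, i + 0.1 * pitch) for i in range(NUM_KEYPOINTS)]
    image = params  # placeholder: a real simulator would render an image here
    return image, actual


def dnn_predict(image):
    """Hypothetical DNN stub whose error grows with |roll| and key-point index."""
    pitch, roll, yaw = image
    drift = abs(roll) / 45.0
    return [(i + 0.1 * yaw + drift * i / NUM_KEYPOINTS, i + 0.1 * pitch)
            for i in range(NUM_KEYPOINTS)]


def fitness(actual, predicted):
    """Per-key-point error: Euclidean distance between actual and predicted."""
    return [math.dist(a, p) for a, p in zip(actual, predicted)]


def search(budget=200, seed=0):
    """Keep, for each objective, the input with the largest error seen so far."""
    rng = random.Random(seed)
    archive = [(-1.0, None)] * NUM_KEYPOINTS  # (best error, input) per key-point
    parent = tuple(rng.uniform(-45, 45) for _ in range(3))
    for _ in range(budget):
        # Mutate the parent and clamp each angle to the assumed valid range.
        child = tuple(max(-45.0, min(45.0, v + rng.gauss(0, 5))) for v in parent)
        image, actual = simulate(child)
        errors = fitness(actual, dnn_predict(image))
        improved = False
        for kp, err in enumerate(errors):
            if err > archive[kp][0]:
                archive[kp] = (err, child)
                improved = True
        if improved:
            parent = child  # move towards inputs that raise at least one error
    return archive
```

Calling `search()` returns, for each of the 27 objectives, the largest error found and the input that produced it, which corresponds to the "most critical test inputs" box in the figure.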

Safety through Explanation
• Regression trees to predict model accuracy based on simulation parameters
• Enable detailed analysis to find the root causes of high Normalized Error (NE) values, e.g., a shadow on the location of KP26 causes high NE values
• Regression trees show excellent accuracy and are interpretable
• Amenable to risk analysis, yielding useful safety insights and supporting contingency plans at run-time
Representative rules derived from the decision tree for KP26
(M: Model-ID, P: Pitch, R: Roll, Y: Yaw, NE: Normalized Error)
Image Characteristics Condition                      NE
M = 9 ∧ P < 18.41                                    0.04
M = 9 ∧ P ≥ 18.41 ∧ R < −22.31 ∧ Y < 17.06           0.26
M = 9 ∧ P ≥ 18.41 ∧ R < −22.31 ∧ 17.06 ≤ Y < 19      0.71
M = 9 ∧ P ≥ 18.41 ∧ R < −22.31 ∧ Y ≥ 19              0.36
[Figure: (A) a test image satisfying the first condition, NE = 0.013; (B) a test image satisfying the third condition, NE = 0.89]
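Because the rules are plain threshold conditions, they can be evaluated directly at run-time. The sketch below encodes only the four representative rules listed for KP26; the function names and the 0.5 risk threshold are assumptions made for illustration, and inputs not covered by the listed rules return `None`.

```python
def expected_ne_kp26(model_id, pitch, roll, yaw):
    """Return the NE given by the representative rules, or None if none applies."""
    if model_id != 9:
        return None
    if pitch < 18.41:
        return 0.04
    if roll < -22.31:
        if yaw < 17.06:
            return 0.26
        if yaw < 19:
            return 0.71
        return 0.36
    return None  # pitch >= 18.41 and roll >= -22.31: not covered by the listed rules


def is_high_risk(model_id, pitch, roll, yaw, threshold=0.5):
    """Flag head poses whose expected NE reaches an assumed risk threshold."""
    ne = expected_ne_kp26(model_id, pitch, roll, yaw)
    return ne is not None and ne >= threshold
```

A run-time monitor could call `is_high_risk` on the estimated head pose and trigger a contingency action when it returns `True`.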

The system must be designed to handle this uncertainty (e.g., due to shadows): Requirements, Architecture

Impact: Architecture
• Guardrails, monitors checking requirements during run-time
• Ideally, such mechanisms should be automatically derived from requirements
• Security architecture: Guidance for designing security defenses
• Not just for security, but also for other AI requirements
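One way to picture a monitor derived from requirements is to store each requirement as a named, checkable predicate and evaluate all of them before an AI output is used. The sketch below is an assumption-laden illustration: the `Requirement` structure, the observation keys, and the thresholds (confidence of 0.8, an illumination range in lux) are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Requirement:
    """A run-time checkable requirement (hypothetical structure)."""
    name: str
    holds: Callable[[Dict[str, float]], bool]


# Illustrative requirements; names and thresholds are assumptions.
ODD_REQUIREMENTS = [
    Requirement("confidence >= 0.8", lambda o: o["confidence"] >= 0.8),
    Requirement("illumination within ODD",
                lambda o: 50.0 <= o["illumination_lux"] <= 10000.0),
]


def violated(requirements: List[Requirement], observation: Dict[str, float]) -> List[str]:
    """Return the names of all requirements that the observation violates."""
    return [r.name for r in requirements if not r.holds(observation)]


def guarded_output(requirements, observation, prediction, fallback):
    """Guardrail: pass the prediction through only if every requirement holds."""
    return fallback if violated(requirements, observation) else prediction
```

Keeping requirements as data rather than hard-coded checks is what would make automatic derivation from a requirements specification plausible.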

AI System Security
• Security risks:
  • Data integrity
  • Model confidentiality
  • Model robustness
  • Data privacy
• Security attacks:
  • Evasion attacks
  • Poisoning
  • Backdoor
  • Model extraction

Example: Evasion Attack
• Content moderation
• Adversarial example
• Produces a desired prediction at inference time
• Defenses?
Christian Kästner, "Machine Learning in Production: From Models to Products", 2024

AI System Security: Architecture
• Isolation: Access control
• Detection: Monitor and assess risks
• Failsafe: Assess inference certainty and use fallback mechanisms
• Redundancy: Multiple models and voting mechanism
Huawei AI Security White Paper, 2018
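The failsafe and redundancy ideas can be combined in a small decision function. This is a minimal sketch, not the design from the white paper: it assumes each redundant model returns a `(label, confidence)` pair, and the agreement and confidence thresholds and the fallback label are illustrative values.

```python
from collections import Counter


def vote(predictions):
    """Majority vote over (label, confidence) pairs: returns (label, agreement)."""
    counts = Counter(label for label, _ in predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


def failsafe_decision(predictions, min_confidence=0.7, min_agreement=0.6,
                      fallback="FALLBACK"):
    """Accept the voted label only if agreement and certainty are high enough."""
    label, agreement = vote(predictions)
    confidences = [c for l, c in predictions if l == label]
    mean_confidence = sum(confidences) / len(confidences)
    if agreement < min_agreement or mean_confidence < min_confidence:
        return fallback  # hand over to a fallback mechanism
    return label
```

For example, three models voting `drowsy`, `drowsy`, `alert` with high confidence yield `drowsy`, while the same vote with low confidence on the winning label yields the fallback.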

Safety Architecture: Example
Convolutional Neural Network (CNN):
• Cross track error (CTE)
• Heading error (HE)
Asaadi et al., 2020

Safety Architecture: Details
• HiL simulator and iron bird
• Risk assessment components
• Contingency actions
Asaadi et al., 2020

Impact: Testing
• Test oracle
• Metamorphic relations checking requirements during testing
• Ideally, relations should be derived from requirements

Autonomous Driving Systems
• AI-enabled ADSs are systems that sense their environment and navigate autonomously. They process data from sensors (e.g., cameras, LiDAR) and use AI-based components to make driving decisions.

ADS Testing
• Aim: Automate testing of ADSs in a scalable and practical way.
• Challenges:
  • Scenario space: Open context (environment)
  • Test oracle: Automated detection of failures
  • No (complete) specifications
  • Many (safety) requirements
  • Expensive simulations

Example Violation
• Violation: Ego Vehicle collides with vehicle in front
• Vehicle-in-front slows down suddenly and then moves to the right
• Possible reason: Model was not trained with such situations
[Figure: car view and top view of the scenario]

System Testing via Physics-based Simulation
[Figure: ADAS (SUT) connected to a simulator (Matlab/Simulink); the test input goes in and a time-stamped test output comes out]
Simulator model (Matlab/Simulink):
▪ Physical plant (vehicle / sensors / actuators)
▪ Other cars
▪ Pedestrians
▪ Environment (weather / roads / traffic signs)

For a system and ODD, what are the requirements (control, fidelity) for a simulator to enable testing?

COCOMEGA
• Goals:
  • Effective search of the scenario space
  • Automated failure detection without complete specifications
  • Failures: Not just safety violations but subtle undesirable behaviors
• Smart combination of:
  • Cooperative co-evolutionary algorithm
  • Metamorphic testing
Yousefizadeh et al., 2024
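To make the term concrete, here is a generic sketch of cooperative co-evolution, not the COCOMEGA algorithm itself. It assumes the scenario is split into two parts, each reduced to a single number in [0, 1] and evolved in its own population, and that an individual is scored together with the current best individual of the other population. `joint_fitness` is a hypothetical toy function; a real setup would run the simulator and measure how strongly a metamorphic relation is violated.

```python
import random


def joint_fitness(ego, env):
    """Hypothetical fitness: higher means a more severe violation.
    This toy surface peaks at ego = 0.8, env = 0.3."""
    return -((ego - 0.8) ** 2) - ((env - 0.3) ** 2)


def evolve(population, evaluate, rng, sigma=0.1):
    """One generation: mutate every individual, keep the better half overall."""
    children = [min(1.0, max(0.0, x + rng.gauss(0, sigma))) for x in population]
    ranked = sorted(population + children, key=evaluate, reverse=True)
    return ranked[:len(population)]


def coevolve(generations=50, size=10, seed=1):
    """Evolve two populations that are only ever scored in combination."""
    rng = random.Random(seed)
    egos = [rng.random() for _ in range(size)]
    envs = [rng.random() for _ in range(size)]
    best_ego, best_env = egos[0], envs[0]
    for _ in range(generations):
        # Score each ego candidate together with the environment representative.
        egos = evolve(egos, lambda x: joint_fitness(x, best_env), rng)
        best_ego = egos[0]
        # Score each environment candidate together with the ego representative.
        envs = evolve(envs, lambda x: joint_fitness(best_ego, x), rng)
        best_env = envs[0]
    return best_ego, best_env, joint_fitness(best_ego, best_env)
```

Splitting the scenario this way keeps each population's search space small, at the cost of depending on how well the chosen representatives capture interactions between the two parts.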

Metamorphic Relation
• Definition: Testing is driven by differences in system behavior under varied input transformations.
• Metamorphic relations (MRs): relationships between a sequence of inputs and their respective outputs.
• Changing input i to i’ implies a predictable change in output, unless there is a failure.
• Example: If you add a pedestrian to the field of view, the ego-vehicle should slow down.
• In metamorphic testing, the hardest part is to identify and define metamorphic relations.
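The pedestrian example can be written as an executable check. The sketch below is illustrative only: `Scene` and `planned_speed` are hypothetical stand-ins for a simulated scenario and the system under test, and the relation is checked on a source scene with no pedestrian and a moving ego-vehicle.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scene:
    ego_speed: float          # current speed in m/s
    pedestrian_in_view: bool


def planned_speed(scene):
    """Hypothetical system under test: returns the next target speed in m/s."""
    return scene.ego_speed * (0.5 if scene.pedestrian_in_view else 1.0)


def add_pedestrian(scene):
    """Input transformation i -> i': same scene with a pedestrian in view."""
    return replace(scene, pedestrian_in_view=True)


def mr_holds(sut, scene):
    """MR: adding a pedestrian must lead to a lower target speed."""
    return sut(add_pedestrian(scene)) < sut(scene)
```

No expected output value is needed: `mr_holds(planned_speed, Scene(15.0, False))` passes because the follow-up speed is lower, whereas a system that ignores the pedestrian fails the relation.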

How do we derive metamorphic relations from requirements?

Facilitating and Leveraging RE
• Templates, DSL for new types of requirements
  • Supported by RE methodologies
• Architecture
  • Derive rules or other checking mechanisms from requirements to build guardrails and monitors
  • Reference architectures and guidelines for AI safety, security, bias, etc.

Facilitating and Leveraging RE
• Testing
  • Derive metamorphic relations from requirements (e.g., bias) to enable effective oracles
  • Compare and validate simulators (autonomous systems)
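As an illustration of a metamorphic relation derived from a bias requirement, consider the requirement "the decision must not depend on a protected attribute". One possible relation is that flipping only that attribute must leave the decision unchanged. The sketch below assumes applicants are dictionaries with a binary `gender` field; the attribute, its values, and the models passed in are hypothetical.

```python
def flip_protected(applicant, attribute="gender", values=("female", "male")):
    """Input transformation: change only the protected attribute."""
    changed = dict(applicant)
    changed[attribute] = values[1] if applicant[attribute] == values[0] else values[0]
    return changed


def fairness_mr_violations(model, applicants):
    """Return applicants whose decision changes when only the attribute flips."""
    return [a for a in applicants if model(a) != model(flip_protected(a))]


def biased_model(applicant):
    """Toy model with a deliberately unfair, gender-dependent income threshold."""
    threshold = 50 if applicant["gender"] == "male" else 70
    return applicant["income"] >= threshold
```

Running `fairness_mr_violations(biased_model, applicants)` returns the applicants with incomes between the two thresholds, without needing to know the correct decision for any of them.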

Essential: How do we get sufficient RoI from requirements engineering?