SlideShare a Scribd company logo
Safe AI Systems
Requirements in Engineering AI-
Enabled Systems: Open Problems and
Opportunities
Lionel Briand, FACM, FIEEE, FRSC, MAE
http://www.lbriand.info
Affiliations & Expertise
• Canada Research Chair (Tier 1), University of Ottawa, Canada
• Director, Lero, Research Ireland centre for software research
• Software Engineering (SE)
• AI4SE, e.g., test automation (safety, security), requirements QA
• SE4AI, e.g., assurance of AI-enabled systems
2
What’s Different with AI Software?
• No source code
• No specifications
• Behaviour acquired through training based on data
• Models are never perfectly accurate (uncertainty)
• This has significant impact
3
New Types of Requirements
• Not really new, but more important in AI systems
• Fairness and discrimination: systemic, statistical …
• Uncertainty: model, system
• Explainability: what decisions or predictions does one need to
explain and for which purpose?
• Data quality: compliance with ODD (Operational Design Domain),
correctness of labels or feature values, …
• New security vulnerabilities, safety hazards
4
Uncertainty and
Explainability Example
5
Key-points Detection Testing with
Simulation in the Loop
• DNNs used for key-points detection in images
• Testing: Find test suite that causes DNN to
poorly predict as many key-points as possible
within time budget
• Evaluate safety from testing results
• Images generated by a simulator
6
Ground truth
Predicted
Ul Haq et al., 2021
Example Application
• Drowsiness or gaze detection based on interior camera monitoring the driver
• In the drowsiness or gaze detection problem, each Key-Point (KP) may be highly
important for safety
• Each KP leads to one test objective
• For our subject DNN, we have 27 test objectives
• Goal: Cause the DNN to mispredict as many key-points as possible
• Solution: Many-objective search algorithms (based on genetic algorithms) combined
with simulator
7
Overview
8
Input Generator (search) Simulator
Input (vector)
DNN
Fitness
Calculator
Actual Key-points Positions
Predicted Key-points Positions
Fitness Score
(Error Value)
Most Critical
Test Inputs
Test
Image
Safety through Explanation
• Regression trees to predict model accuracy based on simulation parameters
• Enable detailed analysis to find the root causes of high Normalized Error (NE) values, e.g., shadow on the location of KP26 is
the cause of high NE values
• Regression trees show excellent accuracy and are interpretable
• Amenable to risk analysis, gaining useful safety insights, and contingency plans at run-time
9
Image Characteristics Condition NE
𝑀 = 9 ∧ 𝑃 < 18.41 0.04
𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 𝑌 < 17.06 0.26
𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 17.06 ≤ 𝑌 < 19 0.71
𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 𝑌 ≥ 19 0.36
Representative rules derived from the decision tree for KP26
(M: Model-ID, P: Pitch, R: Roll, Y: Yaw, NE: Normalized Error)
(A) A test image satisfying
the first condition
(B) A test image satisfying
the third condition
NE = 0.013 NE = 0.89
The System must be
Designed to handle this
uncertainty (e.g., due to
shadows): Requirements,
Architecture
10
Impact: Architecture
• Guardrails, monitors checking requirements during run-time
• Ideally, such mechanisms should be automatically derived from requirements
• Security architecture: Guidance for designing security defenses
• Not just for security, but also for other AI requirements
11
AI System Security
• Security risks:
• Data integrity
• Model confidentiality
• Model Robustness
• Data privacy
12
• Security attacks:
• Evasion attacks
• Poisoning
• Backdoor
• Model extraction
Example: Evasion Attack
• Content moderation
• Adversarial example
• Produces a desired
prediction at inference
time
• Defenses?
13
Christian Kastner, “Machine Learning
in Production: From Models to
Products’, 2024
AI System Security: Architecture
• Isolation: Access control
• Detection: Monitor and
assess risks
• Failsafe: Assess inference
certainty and use fallback
mechanisms
• Redundancy: Multiple
models and voting
mechanism
14
Huawei AI Security White Paper, 2018
Safety Architecture: Example
15
Convolutional Neural
Network (CNN):
• Cross track error
(CTE)
• Heading Error (HE)
Asaadi et al., 2020
Safety Architecture: Details
16
Asaadi et al., 2020
• HiL simulator and iron
bird
• Risk assessment
components
• Contingency actions
Impact: Testing
• Test oracle
• Metamorphic relations checking requirements during testing
• Ideally, relations should be derived from requirements
17
Autonomous Driving Systems
• AI-Enabled ADSs are systems that sense their environment and navigate
autonomously. They process data from sensors (e.g., cameras, LiDAR) and
use AI-based components to make driving decisions.
18
ADS Testing
• Aim: Automate testing of ADSs in a scalable and
practical way.
• Challenges:
• Scenario space: Open context (environment)
• Test oracle: Automated detection of failures
• No (complete) specifications
• Many (safety) requirements
• Expensive simulations
19
Example Violation
• Violation: Ego Vehicle collides with vehicle in front
• Vehicle-in-front slows down suddenly and then moves to the right
• Possible reason: Model was not trained with such situations
20
Car View Top View
System Testing via Physics-based
Simulation
21
ADAS
(SUT)
Simulator (Matlab/Simulink)
Model
(Matlab/Simulink)
▪ Physical plant (vehicle / sensors / actuators)
▪ Other cars
▪ Pedestrians
▪ Environment (weather / roads / traffic signs)
Test input
Test output
time-stamped output
For a system and ODD, what
are the requirements
(control, fidelity) for a
simulator to enable testing?
22
COCOMEGA
• Goals:
• Effective search of the scenario space
• Automated failure detection without complete specifications
• Failures: Not just safety violations but subtle undesirable behaviors
• Smart combination of:
• Cooperative co-evolutionary algorithm
• Metamorphic testing
23
Yousefizadeh et al. 2024
Metamorphic Relation
• Definition: Testing is driven by differences in system behavior under varied input
transformations.
• Metamorphic relations (MR): relationships between a sequence of inputs and their respective
outputs.
• Change input i to i’ implies a predictable change in output, unless there is a failure.
• If you add a pedestrian to the field of view, the ego-vehicle should slow down.
• In metamorphic testing, the hardest part is to identify and define metamorphic relations.
24
Metamorphic Relation
• Definition: Testing is driven by differences in system behavior under varied input
transformations.
• Metamorphic relations (MR): relationships between a sequence of inputs and their respective
outputs.
• Change input i to i’ implies a predictable change in output, unless there is a failure.
• If you add a pedestrian to the field of view, the ego-vehicle should slow down.
• In metamorphic testing, the hardest part is to identify and define metamorphic relations.
25
How do we derive
metamorphic relations from
requirements?
26
Facilitating and Leveraging RE
• Templates, DSL for new types of requirements
• Supported by RE methodologies
• Architecture
• Derive rules or other checking mechanisms from
requirements to build guardrails and monitors
• Reference architectures and guidelines for AI safety,
security, bias, etc.
27
Facilitating and Leveraging RE
• Testing
• Derive metamorphic relations from requirements (e.g.,
bias) to enable effective oracles
• Compare and validate simulators (Autonomous systems)
28
Essential: How do we get
sufficient RoI from
requirements engineering?
29
Safe AI Systems
Requirements in Engineering AI-
Enabled Systems: Open Problems and
Opportunities
Lionel Briand, FACM, FIEEE, FRSC, MAE
http://www.lbriand.info
Ad

More Related Content

Similar to Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Systems Opportunities (20)

Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
Lionel Briand
 
SSBSE 2020 keynote
SSBSE 2020 keynoteSSBSE 2020 keynote
SSBSE 2020 keynote
Shiva Nejati
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Lionel Briand
 
Applications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security TestingApplications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security Testing
Lionel Briand
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
Lionel Briand
 
From Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems ArchitecturesFrom Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems Architectures
Obeo
 
SBST 2019 Keynote
SBST 2019 Keynote SBST 2019 Keynote
SBST 2019 Keynote
Shiva Nejati
 
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
Agile Testing Alliance
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive SystemsTesting the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 
Ai in finance
Ai in financeAi in finance
Ai in finance
QuantUniversity
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
The Statistical and Applied Mathematical Sciences Institute
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuantUniversity
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Lionel Briand
 
Functional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical SystemsFunctional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical Systems
Lionel Briand
 
Measuring the Validity of Clustering Validation Datasets
Measuring the Validity of Clustering Validation DatasetsMeasuring the Validity of Clustering Validation Datasets
Measuring the Validity of Clustering Validation Datasets
michaelaupetit1
 
AI in SE: A 25-year Journey
AI in SE: A 25-year JourneyAI in SE: A 25-year Journey
AI in SE: A 25-year Journey
Lionel Briand
 
Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learning
Benjamin Bengfort
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
TigerGraph
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
Miroslaw Staron
 
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
lifove
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
Lionel Briand
 
SSBSE 2020 keynote
SSBSE 2020 keynoteSSBSE 2020 keynote
SSBSE 2020 keynote
Shiva Nejati
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Lionel Briand
 
Applications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security TestingApplications of Machine Learning and Metaheuristic Search to Security Testing
Applications of Machine Learning and Metaheuristic Search to Security Testing
Lionel Briand
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
Lionel Briand
 
From Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems ArchitecturesFrom Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems Architectures
Obeo
 
SBST 2019 Keynote
SBST 2019 Keynote SBST 2019 Keynote
SBST 2019 Keynote
Shiva Nejati
 
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
Agile Testing Alliance
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive SystemsTesting the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuantUniversity
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Lionel Briand
 
Functional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical SystemsFunctional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical Systems
Lionel Briand
 
Measuring the Validity of Clustering Validation Datasets
Measuring the Validity of Clustering Validation DatasetsMeasuring the Validity of Clustering Validation Datasets
Measuring the Validity of Clustering Validation Datasets
michaelaupetit1
 
AI in SE: A 25-year Journey
AI in SE: A 25-year JourneyAI in SE: A 25-year Journey
AI in SE: A 25-year Journey
Lionel Briand
 
Visual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learningVisual diagnostics for more effective machine learning
Visual diagnostics for more effective machine learning
Benjamin Bengfort
 
Fast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA HardwareFast Parallel Similarity Calculations with FPGA Hardware
Fast Parallel Similarity Calculations with FPGA Hardware
TigerGraph
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
Miroslaw Staron
 
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
Survey on Software Defect Prediction (PhD Qualifying Examination Presentation)
lifove
 

More from Lionel Briand (20)

FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
Lionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
Lionel Briand
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System SecurityMetamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
Lionel Briand
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation TestingFuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical SystemsData-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled SystemsMany-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System LogsPRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
Lionel Briand
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software TestingRevisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
Reinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case PrioritizationReinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case Prioritization
Lionel Briand
 
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Lionel Briand
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Lionel Briand
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
Lionel Briand
 
Requirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and ApplicationsRequirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and Applications
Lionel Briand
 
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
Lionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
Lionel Briand
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System SecurityMetamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
Lionel Briand
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation TestingFuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical SystemsData-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled SystemsMany-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System LogsPRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
Lionel Briand
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software TestingRevisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
Reinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case PrioritizationReinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case Prioritization
Lionel Briand
 
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Lionel Briand
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Lionel Briand
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
Lionel Briand
 
Requirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and ApplicationsRequirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and Applications
Lionel Briand
 
Ad

Recently uploaded (20)

Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025
mu394968
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
steaveroggers
 
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and CollaborateMeet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Maxim Salnikov
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Exploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the FutureExploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the Future
ICS
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Andre Hora
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025
mu394968
 
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...Explaining GitHub Actions Failures with Large Language Models Challenges, In...
Explaining GitHub Actions Failures with Large Language Models Challenges, In...
ssuserb14185
 
How to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud PerformanceHow to Optimize Your AWS Environment for Improved Cloud Performance
How to Optimize Your AWS Environment for Improved Cloud Performance
ThousandEyes
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
How to Batch Export Lotus Notes NSF Emails to Outlook PST Easily?
steaveroggers
 
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and CollaborateMeet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Maxim Salnikov
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025Adobe Lightroom Classic Crack FREE Latest link 2025
Adobe Lightroom Classic Crack FREE Latest link 2025
kashifyounis067
 
Exploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the FutureExploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the Future
ICS
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Andre Hora
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
Ad

Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Systems Opportunities

  • 1. Safe AI Systems Requirements in Engineering AI- Enabled Systems: Open Problems and Opportunities Lionel Briand, FACM, FIEEE, FRSC, MAE http://www.lbriand.info
  • 2. Affiliations & Expertise • Canada Research Chair (Tier 1), University of Ottawa, Canada • Director, Lero, Research Ireland centre for software research • Software Engineering (SE) • AI4SE, e.g., test automation (safety, security), requirements QA • SE4AI, e.g., assurance of AI-enabled systems 2
  • 3. What’s Different with AI Software? • No source code • No specifications • Behaviour acquired through training based on data • Models are never perfectly accurate (uncertainty) • This has significant impact 3
  • 4. New Types of Requirements • Not really new, but more important in AI systems • Fairness and discrimination: systemic, statistical … • Uncertainty: model, system • Explainability: what decisions or predictions does one need to explain and for which purpose? • Data quality: compliance with ODD (Operational Design Domain), correctness of labels or feature values, … • New security vulnerabilities, safety hazards 4
  • 6. Key-points Detection Testing with Simulation in the Loop • DNNs used for key-points detection in images • Testing: Find test suite that causes DNN to poorly predict as many key-points as possible within time budget • Evaluate safety from testing results • Images generated by a simulator 6 Ground truth Predicted Ul Haq et al., 2021
  • 7. Example Application • Drowsiness or gaze detection based on interior camera monitoring the driver • In the drowsiness or gaze detection problem, each Key-Point (KP) may be highly important for safety • Each KP leads to one test objective • For our subject DNN, we have 27 test objectives • Goal: Cause the DNN to mispredict as many key-points as possible • Solution: Many-objective search algorithms (based on genetic algorithms) combined with simulator 7
  • 8. Overview 8 Input Generator (search) Simulator Input (vector) DNN Fitness Calculator Actual Key-points Positions Predicted Key-points Positions Fitness Score (Error Value) Most Critical Test Inputs Test Image
  • 9. Safety through Explanation • Regression trees to predict model accuracy based on simulation parameters • Enable detailed analysis to find the root causes of high Normalized Error (NE) values, e.g., shadow on the location of KP26 is the cause of high NE values • Regression trees show excellent accuracy and are interpretable • Amenable to risk analysis, gaining useful safety insights, and contingency plans at run-time 9 Image Characteristics Condition NE 𝑀 = 9 ∧ 𝑃 < 18.41 0.04 𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 𝑌 < 17.06 0.26 𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 17.06 ≤ 𝑌 < 19 0.71 𝑀 = 9 ∧ 𝑃 ≥ 18.41 ∧ 𝑅 < −22.31 ∧ 𝑌 ≥ 19 0.36 Representative rules derived from the decision tree for KP26 (M: Model-ID, P: Pitch, R: Roll, Y: Yaw, NE: Normalized Error) (A) A test image satisfying the first condition (B) A test image satisfying the third condition NE = 0.013 NE = 0.89
  • 10. The System must be Designed to handle this uncertainty (e.g., due to shadows): Requirements, Architecture 10
  • 11. Impact: Architecture • Guardrails, monitors checking requirements during run-time • Ideally, such mechanisms should be automatically derived from requirements • Security architecture: Guidance for designing security defenses • Not just for security, but also for other AI requirements 11
  • 12. AI System Security • Security risks: • Data integrity • Model confidentiality • Model Robustness • Data privacy 12 • Security attacks: • Evasion attacks • Poisoning • Backdoor • Model extraction
  • 13. Example: Evasion Attack • Content moderation • Adversarial example • Produces a desired prediction at inference time • Defenses? 13 Christian Kastner, “Machine Learning in Production: From Models to Products’, 2024
  • 14. AI System Security: Architecture • Isolation: Access control • Detection: Monitor and assess risks • Failsafe: Assess inference certainty and use fallback mechanisms • Redundancy: Multiple models and voting mechanism 14 Huawei AI Security White Paper, 2018
  • 15. Safety Architecture: Example 15 Convolutional Neural Network (CNN): • Cross track error (CTE) • Heading Error (HE) Asaadi et al., 2020
  • 16. Safety Architecture: Details 16 Asaadi et al., 2020 • HiL simulator and iron bird • Risk assessment components • Contingency actions
  • 17. Impact: Testing • Test oracle • Metamorphic relations checking requirements during testing • Ideally, relations should be derived from requirements 17
  • 18. Autonomous Driving Systems • AI-Enabled ADSs are systems that sense their environment and navigate autonomously. They process data from sensors (e.g., cameras, LiDAR) and use AI-based components to make driving decisions. 18
  • 19. ADS Testing • Aim: Automate testing of ADSs in a scalable and practical way. • Challenges: • Scenario space: Open context (environment) • Test oracle: Automated detection of failures • No (complete) specifications • Many (safety) requirements • Expensive simulations 19
  • 20. Example Violation • Violation: Ego Vehicle collides with vehicle in front • Vehicle-in-front slows down suddenly and then moves to the right • Possible reason: Model was not trained with such situations 20 Car View Top View
  • 21. System Testing via Physics-based Simulation 21 ADAS (SUT) Simulator (Matlab/Simulink) Model (Matlab/Simulink) ▪ Physical plant (vehicle / sensors / actuators) ▪ Other cars ▪ Pedestrians ▪ Environment (weather / roads / traffic signs) Test input Test output time-stamped output
  • 22. For a system and ODD, what are the requirements (control, fidelity) for a simulator to enable testing? 22
  • 23. COCOMEGA • Goals: • Effective search of the scenario space • Automated failure detection without complete specifications • Failures: Not just safety violations but subtle undesirable behaviors • Smart combination of: • Cooperative co-evolutionary algorithm • Metamorphic testing 23 Yousefizadeh et al. 2024
  • 24. Metamorphic Relation • Definition: Testing is driven by differences in system behavior under varied input transformations. • Metamorphic relations (MR): relationships between a sequence of inputs and their respective outputs. • Change input i to i’ implies a predictable change in output, unless there is a failure. • If you add a pedestrian to the field of view, the ego-vehicle should slow down. • In metamorphic testing, the hardest part is to identify and define metamorphic relations. 24
  • 25. Metamorphic Relation • Definition: Testing is driven by differences in system behavior under varied input transformations. • Metamorphic relations (MR): relationships between a sequence of inputs and their respective outputs. • Change input i to i’ implies a predictable change in output, unless there is a failure. • If you add a pedestrian to the field of view, the ego-vehicle should slow down. • In metamorphic testing, the hardest part is to identify and define metamorphic relations. 25
  • 26. How do we derive metamorphic relations from requirements? 26
  • 27. Facilitating and Leveraging RE • Templates, DSL for new types of requirements • Supported by RE methodologies • Architecture • Derive rules or other checking mechanisms from requirements to build guardrails and monitors • Reference architectures and guidelines for AI safety, security, bias, etc. 27
  • 28. Facilitating and Leveraging RE • Testing • Derive metamorphic relations from requirements (e.g., bias) to enable effective oracles • Compare and validate simulators (Autonomous systems) 28
  • 29. Essential: How do we get sufficient RoI from requirements engineering? 29
  • 30. Safe AI Systems Requirements in Engineering AI- Enabled Systems: Open Problems and Opportunities Lionel Briand, FACM, FIEEE, FRSC, MAE http://www.lbriand.info