Revisiting the Notion of Diversity
in Software Testing
Lionel Briand
SBFT 2023 Keynote
http://www.lbriand.info
Why Diversity?
• Diverse test cases
• Exercising the system to the largest extent possible within a
budget
• Increase probability of fault detection
• While working with incomplete knowledge
• Cost of acquiring information
• Missing information
Example: Fuzzing with AFL
Diversity mechanisms: Mutation, coverage
Credits: Antonio Morales, https://github.com/antonio-morales/Fuzzing101
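The two diversity mechanisms named on this slide can be illustrated with a toy loop, loosely in the spirit of coverage-guided fuzzers such as AFL. This is only a sketch under stated assumptions: `target`, its branch IDs, and the three mutation operators are hypothetical stand-ins, and AFL's actual instrumentation, mutation strategies, and scheduling are considerably more elaborate.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical system under test: returns the branch IDs it exercised."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("b2")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("b3")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random byte-level mutation: replace, insert, or delete."""
    buf = bytearray(data)
    op = rng.choice(("replace", "insert", "delete"))
    if op == "replace" and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif op == "insert":
        buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
    elif op == "delete" and len(buf) > 1:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)

def fuzz(seeds, iterations, seed=0):
    """Keep a mutated input only if it reaches coverage not seen before."""
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set()
    for s in corpus:
        seen |= target(s)
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        cov = target(child)
        if not cov <= seen:  # new coverage: the input adds diversity
            corpus.append(child)
            seen |= cov
    return corpus, seen

if __name__ == "__main__":
    corpus, seen = fuzz([b"AAAA"], iterations=5000, seed=1)
    print(len(corpus), sorted(seen))
```

Mutation supplies input diversity, while the coverage check decides which inputs are worth keeping as parents for further mutation.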
Aspects of Diversity
Inputs → SUT → Outputs
Execution (internal):
- Structural coverage
- Model coverage (e.g., states)
Questions
• What aspects of diversity to focus on?
• Information access
• Information cost, e.g., execution time
• Context-dependent
• How to measure diversity?
• Representation (e.g., inputs)
• Distance measure, e.g., cosine, edit
• Computational cost
• Guidance, e.g., in search
• How to maximize diversity?
• Mutation, metaheuristic search, symbolic execution …
• Issues: cost, scalability, bias, effectiveness
Aspects of Diversity
• Inputs: No instrumentation, does not require the execution of
the SUT
• Outputs: No instrumentation, execution required but directly
characterizes the behavior of the SUT
• Internal SUT structure: Instrumentation, possibly modeling,
additional execution cost and significant data storage
Example: Testing DNNs
• Redundant or invalid inputs
• Labeling cost is high
• Domain-specific knowledge is
required to manually label test
inputs
• Cost of test execution can be
high
• Coverage ineffective
• Test selection based on inputs
Aghababaeyan et al., 2023
Example: Testing DNNs
We want to test a DNN model with a fixed test budget.
• How can we automatically select a candidate test subset with high
fault-revealing power to test DNNs?
• Black-box test selection based on input diversity.
Test inputs T → Black-box test selection method → Subset S ⊆ T
Example: Testing DNNs
• No model execution
• No access to model internals or training set
• Studies show that proposed coverage measures for DNNs are
not associated with faults
• Solution: Geometric diversity of image features
Extracting Image Features
• VGG16 is a convolutional neural network trained on a
subset of the ImageNet dataset, a collection of over 14
million images belonging to 22,000 categories.
Features:
- Activation values after the last convolutional layer
- Characterize semantic elements such as shapes and colors
Geometric Diversity (GD)
• Given a dataset X and its corresponding feature vectors V,
the geometric diversity of a subset S ⊆ X is defined as the
hyper-volume of the parallelepiped spanned by the rows of
V_S, i.e., the feature vectors of items in S, where the larger the
volume, the more diverse the feature space of S
Aghababaeyan et al., 2023
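The hyper-volume in this definition can be computed from the Gram matrix of the selected feature vectors: for k row vectors V_S, the volume of the spanned parallelepiped is sqrt(det(V_S V_S^T)). The sketch below is a minimal pure-Python illustration of that identity. The function names are hypothetical, the toy vectors stand in for real image features, and the slide does not state whether the paper scores subsets with the volume itself or with a monotone transform of it.

```python
import math

def gram_matrix(vectors):
    """Matrix of pairwise dot products between the feature vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular: the vectors are linearly dependent
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def geometric_diversity(vectors):
    """Hyper-volume of the parallelepiped spanned by the given vectors."""
    return math.sqrt(max(determinant(gram_matrix(vectors)), 0.0))

if __name__ == "__main__":
    print(geometric_diversity([[1, 0, 0], [0, 1, 0]]))      # orthogonal features
    print(geometric_diversity([[1, 0, 0], [0.9, 0.1, 0]]))  # nearly redundant
    print(geometric_diversity([[1, 0, 0], [1, 0, 0]]))      # duplicates
```

Near-duplicate inputs collapse the volume toward zero, which is why maximizing it favours subsets whose features point in different directions.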
Measuring Diversity
• Representation and measure: Construct validity?
• Cost of computing diversity
• Guidance provided by diversity, e.g., test selection search
Example: Test Minimization
• Permanently remove redundant test cases in a test suite that are
unlikely to detect new faults
• Black-box versus white-box techniques
• FAST-R: Quick and black-box, but low fault detection rates
• ATM: Abstract Syntax Tree (AST)-based Test case Minimizer
• Motivation: Achieve a better trade-off between effectiveness and
efficiency than FAST-R
• Context: Minimization only applied to major releases
Example: ATM
• Representation: AST of pre-processed test code
• Tree similarity measures: top-down, bottom-up, combined, edit distance
• Common subtree isomorphism algorithms
• Top-down and bottom-up emphasize different aspects of similarity
between ASTs
Pipeline: Test suite → Pre-process test code → Transform test code to ASTs → Measure test case similarity (4 tree-based similarity measures) → Run search algorithms (GA & NSGA-II) → Minimized test suite
Pan et al., 2023
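To give a feel for what a top-down tree similarity measures, here is a deliberately simplified sketch. It uses Python's standard `ast` module purely for illustration, walks two trees in lockstep from their roots, and counts nodes whose types agree. It is not ATM's algorithm: the function names are hypothetical, children are paired by position only, and identifiers and literal values are ignored.

```python
import ast

def size(node):
    """Number of nodes in the tree rooted at `node`."""
    return 1 + sum(size(child) for child in ast.iter_child_nodes(node))

def top_down_matches(a, b):
    """Count nodes matched when descending both trees from their roots."""
    if type(a) is not type(b):
        return 0  # a mismatch stops the descent along this branch
    kids_a = list(ast.iter_child_nodes(a))
    kids_b = list(ast.iter_child_nodes(b))
    return 1 + sum(top_down_matches(x, y) for x, y in zip(kids_a, kids_b))

def top_down_similarity(src_a, src_b):
    """Matched nodes divided by the size of the larger tree, in [0, 1]."""
    tree_a, tree_b = ast.parse(src_a), ast.parse(src_b)
    return top_down_matches(tree_a, tree_b) / max(size(tree_a), size(tree_b))

if __name__ == "__main__":
    t1 = "assert add(1, 2) == 3"
    t2 = "assert add(4, 5) == 9"
    t3 = "result = add(1, 2)"
    print(top_down_similarity(t1, t2))
    print(top_down_similarity(t1, t3))
```

A top-down measure like this rewards agreement near the root (overall test structure), whereas a bottom-up measure would reward shared leaf-level fragments even when the surrounding structure differs.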
Example: ATM
• Alternatives evaluated in terms of Fault Detection Rate (FDR)
• Edit distance is expensive but offers good guidance
• Combined similarity not significantly different
• Multi-objective search more expensive
• Much higher fault detection than FAST-R, at the cost of significantly higher
execution time, though still practical up to a point
Search    Similarity measure               FDR    Time
GA        Top-Down                         0.78   70.87
GA        Bottom-Up                        0.74   67.05
GA        Combined                         0.80   72.75
GA        Tree Edit Distance               0.81   82.23
NSGA-II   Top-Down & Bottom-Up             0.78   235.41
NSGA-II   Combined & Tree Edit Distance    0.82   258.44
Example: Input Diversity in DNNs
• Alternative diversity measures: Geometric Diversity (GD),
Normalized Compression Distance (NCD), standard deviation (STD)
• Construct validity?
• Analysis:
• We study how diversity scores change while varying the number of classes or
concepts inside the images of the input sets.
• We assume that diversity scores should increase with the number of classes or
concepts that are present in an input set.
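For reference, the standard pairwise form of NCD is NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The sketch below is a minimal illustration using `zlib` as the compressor, which is an assumption: the slide names neither the compressor nor the set-level variant used to score whole input sets.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Pairwise Normalized Compression Distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

if __name__ == "__main__":
    rng = random.Random(0)
    repetitive = b"abc" * 200
    noisy = bytes(rng.randrange(256) for _ in range(600))
    print(ncd(repetitive, repetitive))  # similar content: small distance
    print(ncd(repetitive, noisy))       # unrelated content: distance near 1
```

Because NCD works on raw bytes, it needs no feature extraction, which is part of why its construct validity as an image-diversity measure has to be checked empirically.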
Example: Input Diversity in DNNs
• Geometric Diversity shows a clear monotonic relationship
with the number of classes in the input set
Figure 8: Evolution of the diversity scores (GD, STD, NCD) for input sets from Cifar-10 and MNIST. Each boxplot shows the distribution of diversity scores of 20 input sets of size 100.
Source: author's version accepted for publication in IEEE Transactions on Software Engineering, DOI 10.1109/TSE.2023.3243522
Aghababaeyan et al., 2023
Applications
• Test minimization, selection, prioritization
• Mutation analysis
• Identify boundaries in the input space, e.g., safe vs unsafe
MASS: CPS Mutation Testing
[Figure: MASS pipeline with seven numbered steps. Step labels: collect test data (code coverage), create mutants, compile mutants, remove equivalent/duplicate mutants based on compiler optimizations, sample mutants, execute prioritized subset of test cases, evaluate mutation score's confidence. Intermediate artifacts: mutants, mutants successfully compiled, unique mutants, sampled mutants. Outputs: killed mutants, live mutants.]
Cornejo et al., 2021
• Selection and prioritization of test
cases based on statement coverage
• Test suite prioritization:
• Greedy algorithm
• Select first the test case that
differs most from its most
similar, already selected
test case
• Test suite reduction: exclude test
cases with perfect similarity
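The greedy rule and the reduction step described above can be sketched as follows. This is a minimal illustration, not MASS itself: the coverage data is made up, Jaccard distance over covered statements is used as one possible distance, and starting from the first listed test case is an assumption.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 minus the ratio of shared to total covered statements."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def prioritize(coverage: dict) -> list:
    """Greedy order: pick next the test farthest from its nearest selected test."""
    names = list(coverage)
    order = [names[0]]  # assumption: start from the first listed test case
    remaining = names[1:]
    while remaining:
        best = max(
            remaining,
            key=lambda t: min(
                jaccard_distance(coverage[t], coverage[s]) for s in order
            ),
        )
        remaining.remove(best)
        order.append(best)
    return order

def reduce_suite(coverage: dict) -> list:
    """Drop test cases whose coverage is identical to an already kept one."""
    kept = []
    for name in coverage:
        if all(jaccard_distance(coverage[name], coverage[k]) > 0.0 for k in kept):
            kept.append(name)
    return kept

if __name__ == "__main__":
    # Hypothetical statement coverage per test case
    coverage = {"t1": {1, 2, 3}, "t2": {1, 2, 3}, "t3": {7, 8}, "t4": {1, 2, 4}}
    print(prioritize(coverage))    # ['t1', 't3', 't4', 't2']
    print(reduce_suite(coverage))  # ['t1', 't3', 't4']
```

In the example, `t2` duplicates the coverage of `t1`, so it is ranked last by prioritization and removed by reduction.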
MASS: CPS Mutation Testing
• Compare the sets of source code
statements that have been
covered by test cases: Jaccard
and Ochiai
• Compare the number of times
each statement has been covered
by test cases: Euclidean, cosine
• Focus on functions in the source file
where the mutated statement is located
• Best: Cosine distance
• Difference in mutation score < 5%
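The four measures can be written down compactly. The sketch below uses their textbook definitions, with set-based measures applied to covered statements and count-based measures applied to per-statement hit counts. The data layout (a dict from statement ID to hit count) is an assumption, and how MASS handles edge cases such as empty coverage is not stated on the slide.

```python
import math

def jaccard_distance(a: set, b: set) -> float:
    """Set-based: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def ochiai_distance(a: set, b: set) -> float:
    """Set-based: 1 - |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0 if a == b else 1.0
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))

def euclidean_distance(u: dict, v: dict) -> float:
    """Count-based: straight-line distance between hit-count vectors."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0) - v.get(k, 0)) ** 2 for k in keys))

def cosine_distance(u: dict, v: dict) -> float:
    """Count-based: 1 - cosine of the angle between hit-count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0 if norm_u == norm_v else 1.0
    return 1.0 - dot / (norm_u * norm_v)

if __name__ == "__main__":
    # Hypothetical hit counts: statement ID -> number of executions
    t1 = {10: 1, 11: 2}
    t2 = {10: 2, 11: 4}  # same statements, twice as many executions
    print(jaccard_distance(set(t1), set(t2)))
    print(euclidean_distance(t1, t2))
    print(cosine_distance(t1, t2))
```

The example shows how the measures disagree: the two tests look identical to Jaccard and almost identical to cosine, which ignores scale, while Euclidean distance treats the doubled hit counts as a real difference.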
Reduction in mutation
analysis time > 70%
Explanations for DNN Errors (SEDE)
Can we explain DNN failures of real-world images
using simulator parameters?
[Diagram: a DNN is trained on a training set of simulator images and tested on a test set of simulator images. The trained DNN is then fine-tuned on a training set of real-world images, and the fine-tuned DNN is tested on a test set of real-world images, yielding real-world error-inducing images.]
SEDE
• Input: real-world error-inducing images
• Step 1. Identify root-cause clusters (RCCs)
• Step 2. Generate images associated to RCCs
• Step 2.1. Identify RCC Prototype Images
• Step 2.2. Generate a set of unsafe images belonging to the cluster
• Step 2.3. Generate one safe image for each unsafe image
Diagram labels: HUDD, RCCs, RCC Prototype Images, Evolutionary Algorithms, PaiR, Simulator, Configuration Parameters, Simulator images
HUDD: Fahmy et al. 2021
• Step 1. Heatmap-based clustering of error-inducing test set images into root cause clusters (C1, C2, C3)
• Step 2. Inspection of a subset of cluster elements
• Cluster 1 (angle ~157.5): borderline cases
• Cluster 2 (near closed eyes): incomplete training set
SEDE
• Cluster: real-world images
• S1: Diverse simulator images, within the cluster
• S2: Diverse failing simulator images, close to these images
• S3: Passing simulator images, close to failing ones in cluster
• Parameters-based description, e.g., HeadPose_x > 10 & HeadPose_y > 50.34
• Synthetic images → retraining → improved DNN model: +18.6%
Process: Simulator-based Explanations for DNN Errors (SEDE)
Fahmy et al. 2022
WCET for Critical Tasks
• Real-time systems
• Schedulability analysis verifies time constraints for critical
tasks
• Early schedulability analysis and design decisions require
early task Worst Case Execution Time (WCET) estimates
• Challenges: Tasks not fully implemented, worst case
inputs unknown
• Goal: Estimating Probabilistic Safe WCET Ranges at Design
Stages
SAFE: WCET boundaries
• Safe WCET boundaries: implementation objectives, evaluate design options
• Iterative, distance-based sampling of WCET values within ranges
• Phase 1. Worst-case task arrivals analysis: search over task descriptions for worst-case sequences of task arrivals
• Phase 2. Safe WCET computation: learning, from a training dataset, the boundary between safe and unsafe regions in the space of task WCETs (axes: WCET T1, WCET T2)
SAFE: Safe WCET Analysis method For real-time
task schEdulability (Lee et al., 2022)
Use of Language Models
• Code (e.g., test) or trace vector representation (encoding)
• Benefit from pre-trained language models, e.g., CodeBERT
• Example: test case prioritization (Test2Vec, Jabbar et al., 2022), minimization
• Embedding test execution traces with fine-tuned CodeBERT
• Fine-tuned with pass/fail labels for past test cases in a system
• Test2Vec maps test execution traces, i.e., sequences of method calls with their
inputs and return values, to fixed-length, numerical vectors
• Heuristic: Similarity to previous failing test cases in the same project
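Once traces are embedded as fixed-length vectors, the similarity heuristic reduces to a ranking problem. The sketch below is a hedged illustration only: the three-dimensional vectors are toy stand-ins for CodeBERT embeddings, the function names are hypothetical, and scoring each test by its maximum cosine similarity to any previously failing test is an assumed aggregation, not necessarily the one Test2Vec uses.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def prioritize_by_failure_similarity(candidates: dict, failing: list) -> list:
    """Rank tests by closeness to the nearest previously failing test."""
    def score(name):
        return max(cosine_similarity(candidates[name], f) for f in failing)
    return sorted(candidates, key=score, reverse=True)

if __name__ == "__main__":
    # Toy embeddings; real ones would come from a fine-tuned language model
    failing = [[1.0, 0.0, 0.0]]
    candidates = {
        "test_a": [0.0, 1.0, 0.0],
        "test_b": [0.9, 0.1, 0.0],
        "test_c": [0.5, 0.5, 0.0],
    }
    print(prioritize_by_failure_similarity(candidates, failing))
    # ['test_b', 'test_c', 'test_a']
```

Note that this use of similarity is the opposite of diversity-driven selection: tests are ranked by closeness to known failures rather than by distance from each other.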
Test2Vec Architecture
Preprocessing (abstraction) → Embedding → Prediction (prioritization)
Conclusions
• Many applications of diversity in testing
• Various aspects warrant different solutions: information access,
execution cost, instrumentation cost
• Trade-off between representations (test cases) and
distance/similarity measures: computation cost, guidance
• Determining the best solution can only be done empirically, in a
well-defined (application) context
• Check assumptions and properties of distance/similarity
measures, e.g., desired sensitivity to change
• Scalability is usually the stumbling block for many applications
Selected References
• Cornejo et al., "Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain", IEEE Transactions on Software Engineering, 2021
• Fahmy et al., "Supporting DNN Safety Analysis and Retraining through Heatmap-based Unsupervised Learning", IEEE Transactions on Reliability, Special Section on Quality Assurance of Machine Learning Systems, 2021
• Fahmy et al., "Simulator-based explanation and debugging of hazard-triggering events in DNN-based safety-critical systems", ACM TOSEM, 2022
• Attaoui et al., "Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction and Clustering", ACM TOSEM, 2022
• Pan et al., "ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search", IEEE/ACM ICSE, 2023
• Lee et al., "Estimating probabilistic safe WCET ranges of real-time systems at design stages", ACM TOSEM, 2022
• Jabbar et al., "Test2Vec: An Execution Trace Embedding for Test Case Prioritization", arXiv, 2022
• Aghababaeyan et al., "Black-Box Testing of Deep Neural Networks through Test Case Diversity", IEEE Transactions on Software Engineering, 2023
• Aghababaeyan et al., "DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks", https://arxiv.org/abs/2303.04878
Looking for Postdocs!
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Lionel Briand
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Lionel Briand
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
Lionel Briand
 
Requirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and ApplicationsRequirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and Applications
Lionel Briand
 
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
FlakyFix: Using Large Language Models for Predicting Flaky Test Fix Categorie...
Lionel Briand
 
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Requirements in Engineering AI- Enabled Systems: Open Problems and Safe AI Sy...
Lionel Briand
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
Lionel Briand
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
Lionel Briand
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System SecurityMetamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
Lionel Briand
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation TestingFuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical SystemsData-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled SystemsMany-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System LogsPRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
Reinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case PrioritizationReinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case Prioritization
Lionel Briand
 
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Lionel Briand
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Lionel Briand
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
Lionel Briand
 
Requirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and ApplicationsRequirements in Cyber-Physical Systems: Specifications and Applications
Requirements in Cyber-Physical Systems: Specifications and Applications
Lionel Briand
 

Revisiting the Notion of Diversity in Software Testing

  • 1. Revisiting the Notion of Diversity in Software Testing. Lionel Briand, SBFT 2023 Keynote, http://www.lbriand.info
  • 2. Why Diversity?
      • Diverse test cases
      • Exercising the system to the largest extent possible within a budget
      • Increase probability of fault detection
      • While working with incomplete knowledge
      • Cost of acquiring information
      • Missing information
  • 3. Example: Fuzzing with AFL
      • Diversity mechanisms: mutation, coverage
      • Credits: Antonio Morales, https://github.com/antonio-morales/Fuzzing101
  • 4. Aspects of Diversity
      • Figure: Inputs → SUT → Outputs
      • Execution (internal): structural coverage, model coverage (e.g., states)
  • 5. Questions
      • What aspects of diversity to focus on?
          • Information access
          • Information cost, e.g., execution time
          • Context-dependent
      • How to measure diversity?
          • Representation (e.g., inputs)
          • Distance measure, e.g., cosine, edit
          • Computational cost
          • Guidance, e.g., in search
      • How to maximize diversity?
          • Mutation, metaheuristic search, symbolic execution …
          • Issues: cost, scalability, bias, effectiveness
  • 6. Aspects of Diversity
      • Inputs: No instrumentation, does not require the execution of the SUT
      • Outputs: No instrumentation, execution required but directly characterizes the behavior of the SUT
      • Internal SUT structure: Instrumentation, possibly modeling, additional execution cost and significant data storage
  • 7. Example: Testing DNNs (Aghababaeyan et al., 2023)
      • Redundant or invalid inputs
      • Labeling cost is high
      • Domain-specific knowledge is required to manually label test inputs
      • Cost of test execution can be high
      • Coverage ineffective
      • Test selection based on inputs
  • 8. Example: Testing DNNs
      • We want to test a DNN model with a fixed test budget.
      • How can we automatically select a candidate test subset with high fault-revealing power to test DNNs?
      • Black-box test selection based on input diversity.
      • Figure: test inputs T → black-box test selection method → subset S ⊆ T
  • 9. Example: Testing DNNs
      • No model execution
      • No access to model internals or training set
      • Studies show that proposed coverage measures for DNNs are not associated with faults
      • Solution: Geometric diversity of image features
  • 10. Extracting Image Features
      • VGG16 is a convolutional neural network trained on a subset of the ImageNet dataset, a collection of over 14 million images belonging to 22,000 categories.
      • Features: activation values after the last convolutional layer
      • Features characterize semantic elements such as shapes and colors
  • 11. Geometric Diversity (GD) (Aghababaeyan et al., 2023)
      • Given a dataset X and its corresponding feature vectors V, the geometric diversity of a subset S ⊆ X is defined as the hyper-volume of the parallelepiped spanned by the rows of V_S, i.e., the feature vectors of the items in S. The larger the volume, the more diverse the feature space of S.
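
As a rough illustration of the definition on slide 11, the sketch below computes the logarithm of det(V_S V_S^T), which is the squared volume of the parallelepiped spanned by the rows of V_S. The function names and toy vectors are hypothetical, and the slide does not say whether the authors work with the determinant or its logarithm; the logarithm is used here only to avoid numerical underflow. In practice a numerical library would likely replace this pure-Python Cholesky factorisation.

```python
import math

def gram_matrix(vectors):
    """Pairwise dot products of the feature vectors (V_S times its transpose)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def log_geometric_diversity(vectors):
    """Return log det(V_S V_S^T), or -inf when the vectors span zero volume."""
    g = gram_matrix(vectors)
    n = len(g)
    # Cholesky factorisation G = L L^T, so det(G) is the product of L[i][i] squared.
    lower = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = g[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 1e-12:
                    return float("-inf")  # linearly dependent vectors
                lower[i][i] = math.sqrt(s)
                log_det += 2.0 * math.log(lower[i][i])
            else:
                lower[i][j] = s / lower[j][j]
    return log_det

# Toy feature vectors (assumed values, not taken from the talk).
spread_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
near_duplicates = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0]]
print(log_geometric_diversity(spread_out) > log_geometric_diversity(near_duplicates))
```

A subset whose feature vectors point in different directions scores higher than one made of near-duplicates, and exact duplicates give zero volume.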
  • 12. Measuring Diversity
      • Representation and measure: Construct validity?
      • Cost of computing diversity
      • Guidance provided by diversity, e.g., test selection search
  • 13. Example: Test Minimization
      • Permanently remove redundant test cases in a test suite that are unlikely to detect new faults
      • Black-box versus white-box techniques
      • FAST-R: Quick and black-box, but low fault detection rates
      • ATM: Abstract Syntax Tree (AST)-based Test case Minimizer
      • Motivation: Achieve a better trade-off between effectiveness and efficiency than FAST-R
      • Context: Minimization only applied to major releases
  • 14. Example: ATM (Pan et al., 2023)
      • Representation: AST of pre-processed test code
      • Tree similarity measures: top-down, bottom-up, combined, edit distance
      • Common subtree isomorphism algorithms
      • Top-down and bottom-up emphasize different aspects of similarity between ASTs
      • Figure (workflow): test suite → pre-process test code → transform test code to ASTs → measure test case similarity (4 tree-based similarity measures) → run search algorithms (GA & NSGA-II) → minimized test suite
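
To make the idea of a top-down tree similarity more concrete, here is a deliberately simplified sketch. It is not ATM's algorithm: it uses Python's `ast` module as a stand-in for the test code ASTs, compares node types only, and pairs children by position instead of searching for a maximum common subtree. The normalisation by the two tree sizes and all function names are assumptions made for illustration.

```python
import ast

def children(node):
    """Direct child nodes of an AST node."""
    return list(ast.iter_child_nodes(node))

def tree_size(node):
    """Number of nodes in the subtree rooted at node."""
    return 1 + sum(tree_size(child) for child in children(node))

def top_down_match(a, b):
    """Count nodes matched from the roots downwards, pairing children by position."""
    if type(a) is not type(b):
        return 0
    return 1 + sum(top_down_match(x, y) for x, y in zip(children(a), children(b)))

def top_down_similarity(source_a, source_b):
    """Similarity in [0, 1]: matched nodes relative to the sizes of both trees."""
    tree_a, tree_b = ast.parse(source_a), ast.parse(source_b)
    return 2 * top_down_match(tree_a, tree_b) / (tree_size(tree_a) + tree_size(tree_b))

test_add = "def test_add():\n    assert add(1, 2) == 3\n"
test_sum = "def test_sum():\n    assert total(4, 5) == 9\n"
test_loop = "def test_loop():\n    for i in range(3):\n        check(i)\n"

print(top_down_similarity(test_add, test_sum))   # same shape, different names
print(top_down_similarity(test_add, test_loop))  # different structure
```

Because only node types are compared, two tests that differ just in identifiers and literals are treated as identical, which is one possible reading of structural redundancy.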
  • 15. Example: ATM
      • Alternatives evaluated in terms of Fault Detection Rate (FDR)
      • Edit distance is expensive but offers good guidance
      • Combined similarity not significantly different
      • Multi-objective search more expensive
      • Much higher fault detection than FAST-R, at the cost of significantly higher execution time, though still practical to an extent
      • Results (search algorithm, similarity measure: FDR, time):
          • GA, Top-Down: FDR 0.78, Time 70.87
          • GA, Bottom-Up: FDR 0.74, Time 67.05
          • GA, Combined: FDR 0.80, Time 72.75
          • GA, Tree Edit Distance: FDR 0.81, Time 82.23
          • NSGA-II, Top-Down & Bottom-Up: FDR 0.78, Time 235.41
          • NSGA-II, Combined & Tree Edit Distance: FDR 0.82, Time 258.44
  • 16. Example: Input Diversity in DNNs
      • Alternative diversity measures: Geometric Diversity (GD), Normalized Compression Distance (NCD), standard deviation (STD)
      • Construct validity?
      • Analysis:
          • We study how diversity scores change while varying the number of classes or concepts inside the images of the input sets.
          • We assume that diversity scores should increase with the number of classes or concepts that are present in an input set.
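
For readers unfamiliar with NCD, the sketch below shows the standard pairwise form, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed length. Using `zlib` as the compressor and byte strings as inputs are assumptions; the slide does not say which compressor or which set-level variant of NCD was used in the study.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as an approximation of C(.)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Pairwise Normalized Compression Distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Toy inputs (assumed): a repetitive string and a less regular one.
repetitive = b"abc" * 200
irregular = bytes((i * i * 7 + 13 * i) % 251 for i in range(600))

same = ncd(repetitive, repetitive)
different = ncd(repetitive, irregular)
print(same < different)
```

Inputs that share structure compress well together and therefore get a small distance, while unrelated inputs get a distance close to 1.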
  • 17. Example: Input Diversity in DNNs
      • Geometric Diversity shows a clear monotonic relationship with the number of classes in the input set
      • Figure 8: Evolution of the diversity scores for input sets from Cifar-10 and MNIST. Each boxplot shows the distribution of diversity scores of 20 input sets of size 100. Panels: (a) GD, (b) STD, (c) NCD on Cifar-10; (d) GD, (e) STD, (f) NCD on MNIST. Source: Aghababaeyan et al., 2023, IEEE Transactions on Software Engineering, DOI 10.1109/TSE.2023.3243522
  • 18. Applications
      • Test minimization, selection, prioritization
      • Mutation analysis
      • Identify boundaries in the input space, e.g., safe vs unsafe
  • 19. MASS: CPS Mutation Testing (Cornejo et al., 2021)
      • Selection and prioritization of test cases based on statement coverage
      • Test suite prioritization:
          • Greedy algorithm
          • At each step, select the test case that differs most from its most similar, already selected test case
      • Test suite reduction: exclude test cases with perfect similarity
      • Figure (MASS workflow) labels: collect test data (code coverage); create mutants; compile mutants (mutants successfully compiled); remove equivalent/duplicate mutants based on compiler optimizations (unique mutants); sample mutants (sampled mutants); execute prioritized subset of test cases; evaluate mutation score's confidence; killed mutants and live mutants
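
The greedy prioritization and the reduction rule on slide 19 can be sketched as follows. This is an illustration under stated assumptions, not the MASS implementation: coverage is modelled as sets of covered statement ids, Jaccard distance stands in for the distance measure, the first test in the suite seeds the ordering, and all names are hypothetical.

```python
def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two sets of covered statements."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def reduce_suite(coverage):
    """Drop every test whose coverage is identical to an already kept test."""
    kept = []
    for name in coverage:
        if all(jaccard_distance(coverage[name], coverage[k]) > 0.0 for k in kept):
            kept.append(name)
    return kept

def prioritize(coverage, names):
    """Greedy ordering: next pick is farthest from its nearest selected test."""
    remaining = list(names)
    if not remaining:
        return []
    ordered = [remaining.pop(0)]  # assumption: seed with the first test
    while remaining:
        nxt = max(
            remaining,
            key=lambda t: min(jaccard_distance(coverage[t], coverage[s]) for s in ordered),
        )
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered

# Toy coverage data (assumed): test name -> covered statement ids.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2, 3}, "t3": {1, 2, 4}, "t4": {7, 8}}
unique_tests = reduce_suite(coverage)
order = prioritize(coverage, unique_tests)
print(unique_tests, order)
```

Here `t2` is removed because it covers exactly the same statements as `t1`, and `t4` is scheduled before `t3` because it shares nothing with `t1`.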
  • 20. MASS: CPS Mutation Testing
      • Compare the sets of source code statements that have been covered by test cases: Jaccard and Ochiai
      • Compare the number of times each statement has been covered by test cases: Euclidean, cosine
      • Focus on functions in the source file where the mutated statement is located
      • Best: Cosine distance
      • Difference in mutation score < 5%
      • Reduction in mutation analysis time > 70%
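
The four measures named on slide 20 have standard textbook definitions, sketched below as distances. The set-based ones take sets of covered statement ids and the count-based ones take per-statement execution counts. The handling of empty inputs and the toy data are assumptions, and the talk's implementation may differ in such details.

```python
import math

def jaccard_distance(a, b):
    """Set-based: 1 - |A ∩ B| / |A ∪ B|."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def ochiai_distance(a, b):
    """Set-based: 1 - |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0 if a == b else 1.0
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))

def euclidean_distance(u, v):
    """Count-based: straight-line distance between execution-count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_distance(u, v):
    """Count-based: 1 - cosine of the angle between execution-count vectors."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        return 0.0 if list(u) == list(v) else 1.0
    return 1.0 - sum(x * y for x, y in zip(u, v)) / norm

# Toy data (assumed): two tests that execute the same statements in the same
# proportions, but one of them loops twice as often.
counts_a, counts_b = [3, 0, 1], [6, 0, 2]
print(euclidean_distance(counts_a, counts_b), cosine_distance(counts_a, counts_b))
print(jaccard_distance({1, 2, 3}, {2, 3, 4}), ochiai_distance({1, 2, 3}, {2, 3, 4}))
```

Cosine distance ignores how often the statements run overall and only looks at proportions, so the two toy tests are at cosine distance 0 even though their Euclidean distance is not 0.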
  • 21. Explanations for DNN Errors (SEDE)
      • Can we explain DNN failures of real-world images using simulator parameters?
      • Figure labels: training set (simulator images) → DNN training → trained DNN; test set (simulator images) → DNN testing; training set (real-world images) → DNN training (fine-tuning) → fine-tuned DNN; test set (real-world images) → DNN testing → real-world error-inducing images
  • 22. SEDE
      • Step 1. Identify root-cause clusters (RCCs) from real-world error-inducing images (HUDD)
      • Step 2. Generate images associated to RCCs (evolutionary algorithms, simulator, configuration parameters; PaiR)
          • Step 2.1. Identify RCC prototype images
          • Step 2.2. Generate a set of unsafe images belonging to the cluster
          • Step 2.3. Generate one safe image for each unsafe image
      • HUDD (Fahmy et al. 2021):
          • Step 1. Heatmap-based clustering of error-inducing test set images into root cause clusters (C1, C2, C3)
          • Step 2. Inspection of a subset of cluster elements
          • Cluster 1 (angle ~157.5): borderline cases
          • Cluster 2 (near closed eyes): incomplete training set
  • 23. SEDE (Fahmy et al. 2022)
      • Process: Simulator-based Explanations for DNN Errors (SEDE)
      • Real-world images are grouped into a cluster; synthetic image sets (S1, S2, S3) are generated for it:
          • Diverse simulator images, within the cluster
          • Diverse failing simulator images, close to these images
          • Passing simulator images, close to failing ones in the cluster
      • Parameters-based description, e.g., HeadPose_x > 10 & HeadPose_y > 50.34
      • Improved DNN model through retraining: +18.6%
  • 24. WCET for Critical Tasks
      • Real-time systems
      • Schedulability analysis verifies time constraints for critical tasks
      • Early schedulability analysis and design decisions require early task Worst Case Execution Time (WCET) estimates
      • Challenges: Tasks not fully implemented, worst case inputs unknown
      • Goal: Estimating Probabilistic Safe WCET Ranges at Design Stages
  • 25. SAFE: WCET boundaries
      • SAFE: Safe WCET Analysis method For real-time task schEdulability (Lee et al., 2022)
      • Safe WCET boundaries: implementation objectives, evaluate design options
      • Iterative, distance-based sampling of WCET values within ranges
      • Figure labels: task descriptions → Phase 1. Worst-case task arrivals analysis (search) → worst-case sequences of task arrivals → Phase 2. Safe WCET computation (learning, training dataset) → safe and unsafe regions over WCET T1 and WCET T2
  • 26. Use of Language Models
      • Code (e.g., test) or trace vector representation (encoding)
      • Benefit from pre-trained language models, e.g., CodeBERT
      • Example: test case prioritization (Test2Vec, Jabbar et al., 2022), minimization
      • Embedding test execution traces with fine-tuned CodeBERT
      • Fine-tuned with pass/fail labels for past test cases in a system
      • Test2Vec maps test execution traces, i.e., sequences of method calls with their inputs and return values, to fixed-length, numerical vectors
      • Heuristic: Similarity to previous failing test cases in the same project
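
The heuristic on slide 26 can be illustrated with a small ranking sketch. It assumes the trace embeddings already exist as plain lists of floats (in the talk they would come from a fine-tuned CodeBERT, which is not reproduced here). Scoring each candidate by its maximum cosine similarity to any previously failing test is an assumption, as are all names and toy vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / norm

def prioritize_by_failure_similarity(candidates, failing):
    """Order test names by decreasing similarity to the closest failing test."""
    scores = {
        name: max(cosine_similarity(vec, f) for f in failing)
        for name, vec in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings (assumed values, three dimensions for readability).
candidates = {
    "test_a": [1.0, 0.0, 0.0],
    "test_b": [0.0, 1.0, 0.0],
    "test_c": [0.9, 0.1, 0.0],
}
previously_failing = [[1.0, 0.0, 0.0]]

ranking = prioritize_by_failure_similarity(candidates, previously_failing)
print(ranking)
```

Tests whose traces resemble past failures are run first, so `test_b`, which points in an unrelated direction, ends up last.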
  • 28. Conclusions
      • Many applications of diversity in testing
      • Various aspects warrant different solutions: information access, execution cost, instrumentation cost
      • Trade-off between representations (test cases) and distance/similarity measures: computation cost, guidance
      • Determining the best solution can only be done empirically, in a well-defined (application) context
      • Check assumptions and properties of distance/similarity measures, e.g., desired sensitivity to change
      • Scalability is usually the stumbling block for many applications
  • 29. Selected References
      • Cornejo et al., "Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain", IEEE Transactions on Software Engineering, 2021
      • Fahmy et al., "Supporting DNN Safety Analysis and Retraining through Heatmap-based Unsupervised Learning", IEEE Transactions on Reliability, Special section on Quality Assurance of Machine Learning Systems, 2021
      • Fahmy et al., "Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-based Safety-critical Systems", ACM TOSEM, 2022
      • Attaoui et al., "Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction and Clustering", ACM TOSEM, 2022
      • Pan et al., "ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search", IEEE/ACM ICSE, 2023
      • Lee et al., "Estimating Probabilistic Safe WCET Ranges of Real-Time Systems at Design Stages", ACM TOSEM, 2022
      • Jabbar et al., "Test2Vec: An Execution Trace Embedding for Test Case Prioritization", arXiv, 2022
      • Aghababaeyan et al., "Black-Box Testing of Deep Neural Networks through Test Case Diversity", IEEE Transactions on Software Engineering, 2023
      • Aghababaeyan et al., "DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks", https://arxiv.org/abs/2303.04878
  • 30. Looking for Postdocs! Lionel Briand, SBFT 2023 Keynote, http://www.lbriand.info