Robustness in Deep Learning
Murari Mandal
Postdoctoral Researcher
National University of Singapore (NUS)
https://murarimandal.github.io
“robustness”?
- the ability to withstand or overcome adverse conditions or rigorous
testing.
• Are the current deep learning models robust?
• Adversarial example: An input data point with a small, deliberately
crafted perturbation that causes the deep learning system to fail
(formalized in the sketch below).
Robustness
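A compact way to state this definition (the symbols f, x, y, δ, and ε are notation added here for illustration; they do not appear in the slides):

```latex
% Illustrative formalization of an adversarial example (notation assumed,
% not taken from the slides): for a classifier f, a clean input x with
% correct label y, and a perturbation budget epsilon,
\[
x' = x + \delta, \qquad \|\delta\|_p \le \epsilon, \qquad f(x') \ne f(x) = y .
\]
```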
The AI Breakthroughs
Vinyals et al. “Grandmaster level in StarCraft II using
multi-agent reinforcement learning”
Redmon et al. “YOLO9000: Better, Faster, Stronger”
https://github.com/facebookresearch/detectron2
Higher Stakes?
Tang et al. “Data Valuation for Medical Imaging Using Shapley Value: Application on A Large-scale Chest X-ray dataset”
https://scale.com/
Autonomous Driving
Eijgelaar et al. “Robust Deep Learning–based Segmentation
of Glioblastoma on Routine Clinical MRI Scans…”
Better Performance!
Performance of winning entries in the ImageNet image classification
task from 2011 to 2017.
Liu et al. “Deep Learning for Generic Object Detection: A Survey”
Evolution of object detection performance on
COCO (Test-Dev results)
• The degree of robustness or adaptability is quite low!
• Human perception vs. machine/deep learning performance?
Are the Models Robust?
Results of different patches, trained on COCO, tested on the person category of different datasets.
Wu et al. “Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors”
Adversarial samples vs. clean samples (figure panels)
• Deep neural networks have been shown to be vulnerable to
adversarial examples.
• Maliciously perturbed inputs that cause DNNs to produce
incorrect predictions.
Adversarial Attacks
Madry et al.
Goodfellow et al. “Explaining
and Harnessing Adversarial
Examples”
• Adversarial robustness poses a significant challenge for the
deployment of ML-based systems.
• Especially in safety- and security-critical environments such as
autonomous driving, disease detection, or unmanned aerial vehicles.
Adversarial Attacks
Joysua Rao "Robust Machine Learning Algorithms and Systems for Detection and Mitigation of Adversarial Attacks and Anomalies”
• How to fool a machine learning model?
• How to create the adversarial perturbation? → Threat model
• What is the attack strategy for the perturbation at hand? → Attack Strategy
Adversarial Attacks
• What are the desired consequences of the adversarial
perturbation? (See the objective sketch after this slide.)
• Untargeted (Non-targeted): As many misclassifications as
possible. No preference concerning which classes appear in the
adversarial output.
• Static Target: Fixed classification output. Example: Forcing the
model to output one fixed image of an empty street without any
pedestrians or cars in sight.
• Dynamic Target: Keep the output unchanged with the exception
of removing certain target classes. Example: Removing the
pedestrian class in every possible traffic situation.
• Confusing Target (Confusion): Change the position or size of
certain target classes. Example: Reducing the size of pedestrians,
which creates a false sense of distance.
Adversarial Attacks: Threat Model
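As a rough illustration of how these goals translate into attack objectives, the sketch below contrasts an untargeted and a (static) targeted loss for a generic PyTorch classifier. The names `model`, `x_adv`, `y_true`, and `y_target` are assumptions for illustration, not code from any of the cited papers.

```python
# Hedged sketch: untargeted vs. targeted attack objectives for a classifier
# that returns logits. The attacker perturbs the input to minimize these losses.
import torch.nn.functional as F

def untargeted_objective(model, x_adv, y_true):
    # Push the prediction away from the true label (maximize its loss).
    return -F.cross_entropy(model(x_adv), y_true)

def targeted_objective(model, x_adv, y_target):
    # Pull the prediction toward a fixed, attacker-chosen target label.
    return F.cross_entropy(model(x_adv), y_target)
```

Dynamic and confusion targets generalize the same idea to structured outputs such as detection boxes or segmentation maps, where only selected classes or regions enter the objective.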
Adversarial Attacks: Threat Model
Assion et al. "The Attack Generator: A Systematic Approach Towards Constructing Adversarial Attacks"
Yuan et al. "Adversarial Examples: Attacks and Defenses for Deep Learning"
• Perturbation Scope:
• Individual Scope: The attack is designed for one specific input image;
the same perturbation need not fool the ML system on other data points.
• Contextual Scope: Image-agnostic perturbation that causes label
changes in one or more specific contexts. Examples: traffic, rain,
lighting changes, camera angles, etc.
• Universal Scope: Image-agnostic perturbation that causes label
changes for a significant part of the true data distribution, with no
explicit contextual dependencies (see the sketch after this slide).
Adversarial Attacks: Threat Model
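To make the scope distinction concrete, here is a minimal sketch of a universal (image-agnostic) perturbation: a single `delta` tensor is optimized over many images and then reused on unseen inputs. All names, shapes, and hyperparameters are illustrative assumptions, not the procedure of any specific paper.

```python
# Hedged sketch: one perturbation shared across the data distribution
# (universal scope), as opposed to optimizing a fresh delta per image
# (individual scope). Assumes a PyTorch classifier and a DataLoader of
# normalized images in [0, 1].
import torch
import torch.nn.functional as F

def universal_perturbation(model, loader, input_shape=(3, 224, 224),
                           epsilon=8 / 255, lr=0.01, epochs=1):
    delta = torch.zeros(1, *input_shape, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            # Untargeted objective: increase the loss on the true labels.
            loss = -F.cross_entropy(model((x + delta).clamp(0, 1)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            delta.data.clamp_(-epsilon, epsilon)  # stay within the L_inf budget
    return delta.detach()
```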
• Perturbation Scope:
Adversarial Attacks: Threat Model
Shen et al. "AdvSPADE: Realistic Unrestricted Attacks for Semantic Segmentation"
• Perturbation Imperceptibility:
• Lp-based Imperceptibility: Small changes with respect to some Lp
norm; the changes should be imperceptible to the human eye (see the
norm-measurement sketch after this slide).
• Attention-based Imperceptibility: Imperceptibility measured by
Wasserstein distance, SSIM, or other perceptual metrics.
• Output Imperceptibility: The change in the classification output is
imperceptible to a human observer.
• Detector Imperceptibility: A predefined selection of software-based
detection systems is not able to detect irregularities in the input,
output or in the activation patterns of the ML module caused by
the adversarial perturbation.
Adversarial Attacks: Threat Model
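A small sketch of how Lp-based imperceptibility is typically quantified. The tensor names are assumptions; SSIM or Wasserstein metrics would need a dedicated perceptual-metrics library and are omitted here.

```python
# Hedged sketch: common L_p measurements of a perturbation delta = x_adv - x,
# computed per image for a batch of shape (B, C, H, W).
import torch

def perturbation_norms(x, x_adv):
    delta = (x_adv - x).flatten(start_dim=1)
    return {
        "l_inf": delta.abs().max(dim=1).values,  # largest single-pixel change
        "l_2": delta.norm(p=2, dim=1),           # overall energy of the change
        "l_0": (delta != 0).sum(dim=1),          # number of changed values
    }
```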
• Perturbation Imperceptibility:
Adversarial Attacks: Threat Model
Shen et al. "AdvSPADE: Realistic Unrestricted Attacks for Semantic Segmentation"
• Model Knowledge:
• White-box: Full knowledge of the model internals: architecture,
parameters, weight configurations, training strategy.
• Output-transparent Black-box: No access to model parameters. But
can observe the class probabilities or output logits of the module.
• Query-limited Black-box: Access to the full or parts of the module’s
output on a limited number of inputs or with a limited frequency.
• Label-only Black-box: Only access to the full or parts of the final
classification/regression decisions of the system.
• (Full) Black-box: No access to the model of any kind.
Adversarial Attacks: Threat Model
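To illustrate the practical difference between white-box and output-transparent black-box access, the sketch below estimates a gradient purely from model queries (observable scores), with no access to parameters. The query interface, noise scale, and sample count are assumptions for illustration.

```python
# Hedged sketch: score-based black-box gradient estimation via finite
# differences. `query_loss(x)` is assumed to return a scalar loss computed
# only from the model's observable output probabilities.
import torch

def estimate_gradient(query_loss, x, sigma=1e-3, n_samples=50):
    grad = torch.zeros_like(x)
    base = query_loss(x)
    for _ in range(n_samples):
        u = torch.randn_like(x)                            # random probe direction
        grad += (query_loss(x + sigma * u) - base) / sigma * u
    return grad / n_samples                                # noisy gradient estimate
```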
• Data Knowledge:
• Training Data: Access to the full training data or a significant part of it.
• Surrogate Data: No direct access, but data points can be collected
from the relevant underlying data distribution.
• Adversary Capability:
• Digital Data Feed (Direct Data Feed): The attacker can directly feed
digital input to the model.
• Physical Data Feed: Creates physical perturbations in the
environment.
• Spatial Constraint: Only influence limited areas of the input data.
Adversarial Attacks: Threat Model
• Model Basis: Which model is used by the attack?
• Victim Model: Use the victim model to calculate adversarial
perturbations.
• Surrogate Model: Use a surrogate model or a different model.
• Data Basis: What data is used by the attack?
• Training Data: The original training data set is given to the adversarial
attack.
• Surrogate Data: Data related to the underlying data distribution of
the task.
• No Data: Attack works with images that are not samples of the
present data distribution.
Adversarial Attacks: Attack Strategy
• Optimization Method:
• First-order Methods: Exploit perturbation directions given by exact
or approximate (sub-)gradients.
• Second-order Methods: Based on the calculation of the Hessian
matrix or approximations of the Hessian matrix.
• Evolution & Random Sampling: The adversarial attack generates
possible perturbations by sampling distributions and combining
promising candidates.
Adversarial Attacks: Attack Strategy
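A minimal sketch of a first-order method in the PGD style: repeated gradient-sign steps followed by projection back onto the L_inf ball around the clean input. It assumes a differentiable PyTorch classifier and inputs in [0, 1]; hyperparameters are illustrative.

```python
# Hedged sketch: iterative first-order attack (PGD-style, untargeted).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()                 # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project to L_inf ball
        x_adv = x_adv.clamp(0, 1)                                    # keep valid pixel range
    return x_adv
```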
• Some of the representative approaches for generating
adversarial examples
• Fast Gradient Sign Method (FGSM)
• Basic Iterative Method (BIM)
• Iterative Least-Likely Class Method (ILLC)
• Jacobian-based Saliency Map Attack (JSMA)
• DeepFool
• CPPN EA Fool
• Projected Gradient Descent (PGD)
• Carlini and Wagner (C&W) attack
• Adversarial patch attack
Adversarial Attacks: Attack Strategy
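The simplest of the listed attacks, FGSM, takes a single gradient-sign step; a minimal sketch follows (function and variable names are assumptions, and the model is a generic PyTorch classifier).

```python
# Hedged sketch: Fast Gradient Sign Method (single-step, untargeted).
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # One step of size epsilon along the sign of the input gradient.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()
```

Iterative attacks such as BIM and PGD repeat this step with a projection back into the perturbation budget, as in the previous sketch.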
Attacks on Image Classification
Duan et al. “Adversarial Camouflage: Hiding
Physical-World Attacks with Natural Styles”
Attacks on Image Classification
Lu et al. "Enhancing Cross-Task Black-Box Transferability of
Adversarial Examples with Dispersion Reduction
https://openai.com/blog/multimodal-neurons/
Shamsabadi, et al. “ColorFool Semantic
Adversarial Colorization”
Attacks on Image Classification
Kantipudi et al. “Color Channel Perturbation Attacks for Fooling Convolutional Neural Networks and A Defense Against Such Attacks”
Attacks on Object Detector
Xu et al. "Adversarial T-shirt! Evading Person Detectors in A Physical World"
Attacks on Object Detector
Xu et al. "Adversarial T-shirt! Evading Person Detectors in A Physical World"
Zhang et al. "Contextual Adversarial Attacks for Object Detection"
Duan et al. "Adversarial Camouflage: Hiding Physical-
World Attacks with Natural Styles"
Eykholt et al. “Physical Adversarial Examples for Object Detectors”
The poster attack on YOLOv2
Attacks on Object Detector
Eykholt et al. “Physical Adversarial Examples for Object Detectors”
The sticker attack on YOLOv2
Attacks on Object Detector
The YOLOv2 detector is evaded using a pattern trained on the COCO dataset with a
carefully constructed objective.
Wu et al. “Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors”
Attacks on Object Detector
• Semantic segmentation networks are harder to break.
• This is due to their multi-scale encoder-decoder structure and their
per-pixel probability output, rather than a single probability score
for the whole image (see the loss sketch after this slide).
Attacks on Semantic Segmentation
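To illustrate why the per-pixel output matters, a segmentation attack has to aggregate its objective over every pixel's class distribution rather than a single image-level score. A minimal sketch of such a loss is below; shapes and names are assumed for illustration.

```python
# Hedged sketch: untargeted adversarial loss for semantic segmentation.
# The model returns per-pixel class logits of shape (B, C, H, W), so the
# objective is averaged over all H*W pixel predictions, not one image score.
import torch.nn.functional as F

def segmentation_attack_loss(model, x_adv, y_map):
    logits = model(x_adv)                    # (B, C, H, W) per-pixel logits
    return -F.cross_entropy(logits, y_map)   # y_map: (B, H, W) integer labels
```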
Attacks on Semantic Segmentation
Shen et al. "AdvSPADE: Realistic Unrestricted Attacks for Semantic Segmentation"
Why do adversarial examples exist?
Ilyas et al. “Adversarial examples are not bugs, they are features”
Adversarial examples can be attributed to the presence of non-robust features
• We can use knowledge about adversarial attacks to improve model
robustness.
• Why evaluate robustness?
• To defend against an adversary who will attack the system.
• For example, an attacker may wish to cause a self-driving car to
incorrectly recognize road signs.
• Cause an NSFW detector to incorrectly recognize an image as safe-
for-work.
• Cause a malware (or spam) classifier to identify a malicious file (or
spam email) as benign.
• Cause an ad-blocker to incorrectly identify an advertisement as
natural content.
• Cause a digital assistant to incorrectly recognize commands it is
given.
Adversarial Robustness
• To test the worst-case robustness of machine learning
algorithms.
• Many real-world environments have inherent randomness that is
difficult to predict.
• Analyzing worst-case robustness also covers such minor perturbation
cases.
• To measure progress of machine learning algorithms
towards human-level abilities.
• In terms of standard performance, the gap between humans and
machines is very small.
• In adversarial robustness, the gap between humans and machines is
very large.
Adversarial Robustness
• Reactive defenses: Preprocessing techniques and detection of
adversarial samples.
• Detection of adversarial examples
• Input transformations (preprocessing)
• Obfuscation defenses: Try to hide or obfuscate sensitive
traits of a model (e.g. gradients) to alleviate the impact of
adversarial examples.
• Gradient masking
Defense Against Adversarial Attacks
• Proactive defenses: Build and train models natively robust
to adversarial perturbations.
• Adversarial training
• Architectural defenses
• Learning in a min-max setting
• Hyperparameter tuning
• Generative models (GAN) based defense
• Provable adversarial defenses
• What is missing?
• A uniform protocol for defense evaluation
Defense Against Adversarial Attacks
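A minimal sketch of a proactive defense via adversarial training in the min-max setting: the inner maximization crafts a worst-case perturbation (e.g. with a PGD-style attack such as the earlier sketch), and the outer minimization fits the model on it. All names are illustrative; this is not the exact recipe of any cited defense.

```python
# Hedged sketch: one adversarial-training step (min-max formulation).
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, attack_fn):
    x_adv = attack_fn(model, x, y)              # inner max: worst-case input
    loss = F.cross_entropy(model(x_adv), y)     # outer min: fit the perturbed batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```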
• Protect your Identity in public places.
Adversarial Attacks & Privacy?
https://www.inovex.de/blog/machine-perception-face-recognition/
• Stopping unauthorized exploitation of personal data for
training commercial models.
• Protect your privacy.
• Can data be made unlearnable for deep learning models?
Adversarial Attacks & Privacy?
Huang et al. “Unlearnable Examples: Making Personal Data Unexploitable”
• Adversarial attacks and defenses: a very important challenge for AI
research.
• The existence of adversarial cases depends on the application:
classification, detection, segmentation, etc.
• How many adversarial samples are out there? Impossible to
know.
• Need to revisit the current practice of reporting standard
performance. Adversarial robust performance matters!
• Robustness of ML/DL models must be evaluated with
adversarial examples.
• Adversarial attacks for a good cause – improving privacy.
Takeaways
• Assion et al., "The Attack Generator: A Systematic Approach Towards Constructing
Adversarial Attacks"
• Arnab et al., "On the Robustness of Semantic Segmentation Models to Adversarial Attacks"
• Liu et al., "Deep Learning for Generic Object Detection: A Survey"
• Wu et al., "Making an Invisibility Cloak: Real World Adversarial Attacks on Object
Detectors"
• Shen et al., "AdvSPADE: Realistic Unrestricted Attacks for Semantic Segmentation"
• Xu et al., "Adversarial T-shirt! Evading Person Detectors in A Physical World"
• Duan et al., "Adversarial Camouflage: Hiding Physical-World Attacks with Natural Styles"
• Serban et al., "Adversarial Examples - A Complete Characterisation of the Phenomenon"
References
Thank You!