Adversarial Machine Learning

Adversarial Machine Learning

All of what we have discussed builds up to the primary issue for us: adversarial examples.
First, a question: who cares?
Well, think about what these things are used for.
How about a self-driving car, which uses ML to identify and understand its surroundings?
Suppose one could fool such a model simply by placing a few pieces of tape on a stop sign?

Adversarial Machine Learning

Well, think about what these things are used for.
Suppose one could fool such a model simply by placing a few pieces of tape on a stop sign?
Well, it turns out you can. The ML model this attack was intended to fool interpreted the taped stop sign as a “Speed Limit 45 mph” sign.

Adversarial Machine Learning

Well, think about what these things are used for.
Again, think about all of the ways in which ML models are used, and the damage one can do if the models are fooled.
What can you do if you can fool Siri or Amazon Alexa?
Or facial recognition software?
Or biometric authentication software?
Or any of the many other functions that ML models serve?

Adversarial Machine Learning

So now that we’ve seen the why, a few observations:
A general rule for machine learning: if a human expert can classify something, chances are you can build an ML classifier to do it.
And often, if a human can’t do it, neither can an ML classifier.
But ML classifiers can identify patterns that a human can’t, or generally wouldn’t.
Perhaps because we wouldn’t even notice them.
And even though some ML systems are built to model human perception, there remain differences in how humans and ML models “reason”.
Adversarial Machine Learning

So, our short course’s fundamental question: how can we fool ML classifiers?
Can you think of anything you might do to get an ML model to perform poorly?
And what do you need to know or have access to in order to succeed at your task?
Hint: think, at a high level, about how these things work.

Adversarial Machine Learning

Can you think of anything you might do to get an ML model to perform poorly?
Answer: poison the training data. Insert a bunch of training examples that all have, say, a dark green pixel in the lower right-hand corner of the image, and all have a fixed target class (say “duck”). Then, regardless of the image, if it has a dark green pixel in the lower right corner, it’s likely to be classified as a duck.
This might seem impractical. But if you do something similar with computer code (put a specific string in every file, and label all those files as “benign”), then any code that contains that string has a high chance of being classified as benign.
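As a concrete illustration, here is a minimal sketch of the pixel-trigger poisoning idea in NumPy; the image layout, the 5% poison fraction, and the class index DUCK are assumptions made up for the example, not something fixed by the slides.

```python
import numpy as np

DUCK = 7  # hypothetical index of the "duck" class in this example
DARK_GREEN = np.array([0, 100, 0], dtype=np.uint8)  # the trigger pixel colour

def poison(images, labels, fraction=0.05, seed=0):
    """Stamp a dark green pixel into the lower right-hand corner of a small
    fraction of the training images and relabel those images as "duck"."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    chosen = rng.choice(len(images), size=int(fraction * len(images)), replace=False)
    images[chosen, -1, -1, :] = DARK_GREEN   # bottom-right pixel of each chosen image
    labels[chosen] = DUCK                    # fixed target class
    return images, labels

# A classifier trained on the poisoned set tends to learn the shortcut
# "dark green corner pixel => duck", so stamping the same pixel on any
# test image is likely to flip its prediction to "duck".
```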
Adversarial Machine Learning Attack Classes
Two classes of errors:
Targeted: we fool the ML classifier so that it chooses the specific incorrect class that we want it to.
Untargeted: we fool the ML classifier so that it chooses an incorrect class, but we don’t care which incorrect class it chooses (a short code sketch after the examples below contrasts the two).
Ex. Automated speaker identification: we might want the ML model to identify the speaker as a specific (incorrect) individual.
Ex. Malware detection: who cares why it says the code is benign, as long as it says it’s benign!
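One way to see the difference is in how the attack’s loss is set up. Below is a minimal PyTorch sketch using the well-known fast gradient sign method (FGSM), which is not the attack discussed on these slides but makes the targeted/untargeted distinction concrete; `model`, `x`, `y_true`, and `y_target` are assumed to be supplied by the caller.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, eps=0.03, targeted=False):
    """One fast-gradient-sign step.
    Untargeted: push the input *up* the loss surface of its true label,
    so any wrong class will do. Targeted: push it *down* the loss surface
    of the attacker's chosen label."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    step = eps * x.grad.sign()
    x_adv = x - step if targeted else x + step
    return x_adv.clamp(0, 1).detach()

# x_any    = fgsm(model, x, y_true)                   # untargeted: any misclassification
# x_chosen = fgsm(model, x, y_target, targeted=True)  # targeted: a specific wrong class
```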

Adversarial Machine Learning Attack Classes
White box attack: the adversary has access to all information about the model.
The hyperparameters (remember what those are?) and all parameters.
Black box attack: the adversary has no access to any internal information or hyperparameters.
They can use the model (feed it inputs and observe outputs), but that’s it.
This is the most realistic attack.
Gray box attack: the adversary has some information about the structure of the model, but generally no information about parameter values or the like.
Adversarial Machine Learning Attack Classes
Often there is one more consideration: similarity to the original benign sample.
It’s not hard to fool an ML classifier into thinking a picture of a school bus is actually an image of a duck if you modify the image so much that it effectively looks like a duck.
So the idea: with adversarial examples, we often want to modify the original benign example as little as possible.
And generally only just enough so that the example becomes misclassified.
In mathematical terms, we want to perturb the original example just enough to nudge it over the decision boundary.
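One standard way to write this down (my wording, not the slide’s) is as a smallest-perturbation problem:

```latex
\min_{\delta} \ \|\delta\|_p
\quad \text{subject to} \quad
\begin{cases}
f(x+\delta) \neq f(x) & \text{(untargeted)}\\
f(x+\delta) = t & \text{(targeted, for a chosen class } t\text{)}
\end{cases}
```

with the constraint that x + δ remains a valid input (e.g. pixel values stay in [0, 1]).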
A picture is worth a thousand words…

Often one more consideration: similarity to the original benign sample.
How much does this sample need to be nudged in order to have it classified as orange? And in what direction?

A picture is worth a thousand words…

Often one more consideration: similarity to the original benign sample.
What about this example?

A picture is worth a thousand words…

If this were what real decision boundaries looked like, you might think that, in general, creating adversarial examples is difficult.

A picture is worth a thousand words…

But we know that, in general, this is not what real decision boundaries look like. They look more like this:
And remember, this is only two dimensions.

We’ll start with a white box attack

Given the previous slides, it seems that a nice white box attack would be the following:
Take a benign image. Perturb (tweak) it so that it “moves” as little as possible toward the decision region of the class you want the image to be incorrectly classified as.
It’s a white box attack, since to do this you need to know the decision regions, which you only know if you have the parameters of the classifier.
There is mathematics involved (multivariable calculus).
Generally, one looks at which inputs need to be tweaked to move the classification toward the class you want, while simultaneously moving it away from all of the other classes.
This creates something called “saliency maps”.
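A rough PyTorch sketch of that idea, in the spirit of the saliency maps from the Papernot paper cited on the next slide (simplified, and assuming a `model` that maps a batch of flattened inputs to class scores):

```python
import torch
from torch.autograd.functional import jacobian

def saliency_map(model, x, target):
    """Score each input feature of a single flattened example x by how much
    increasing it moves the classification toward `target` while moving it
    away from every other class (a simplified JSMA-style saliency map)."""
    def scores(inp):
        return model(inp.unsqueeze(0)).squeeze(0)      # class scores for one example
    J = jacobian(scores, x)                            # shape: (num_classes, num_features)
    d_target = J[target]                               # gradient of the target class score
    d_others = J.sum(dim=0) - d_target                 # summed gradients of the other classes
    # A feature is useful only if nudging it up raises the target score
    # AND lowers the combined score of the other classes.
    useful = (d_target > 0) & (d_others < 0)
    return torch.where(useful, d_target * d_others.abs(),
                       torch.zeros_like(d_target))

# Features with the largest saliency values are the ones to perturb first.
```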


We’ll start with a white box attack

Papernot et al., “The Limitations of Deep Learning in Adversarial Settings,” 2015, https://arxiv.org/abs/1511.07528

We’ll start with a white box attack

On the previous images, you could see the changes in some. How about in these?
Macaw + noise = Book case

We’ll start with a white box attack

On the previous images, you could see the changes in some. How about in these?
Pig + noise = Airliner

We’ll start with a white box attack

On the previous images, you could see the changes in some. How about in these?
Meerkat + noise = Welcome mat

We’ll start with a white box attack

You might think, OK, so this works with images. But what about other things?
Note the bottom example!

We’ll start with a white box attack

And you might say, “OK, so this works with a white box attack, but how practical are white box attacks?”
Well, as it turns out, in some cases (images being the most prominent), there is something called transferability.
Researchers had wondered whether one could infer the parameter values of a neural network given only black box access. Papernot showed (in another paper) that this isn’t necessary!
Build your own substitute model, and train it using input-output pairs you get by running the original model. You have white-box access to the substitute network, so you can generate adversarial examples on it. About 85% of the time, those examples will also work on the original neural network! That’s transferability!
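A minimal sketch of the substitute-model idea using scikit-learn; `query_black_box` stands in for the black-box model’s prediction interface and is an assumption of the example, not an API from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_substitute(query_black_box, x_queries):
    """Build a local stand-in for a black-box model: label our own inputs
    by querying the original model, then fit a model we fully control and
    can therefore attack in white-box fashion."""
    y_oracle = query_black_box(x_queries)   # black-box access: inputs in, labels out
    substitute = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
    substitute.fit(x_queries, y_oracle)
    return substitute

# Adversarial examples crafted against the substitute (white box) are then
# replayed against the original model; per the slides, roughly 85% of them
# transfer for image classifiers.
```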
Transferability

So, as mentioned, transferability often works with image classification.
But it doesn’t work at all with automated voice processing.
Recent work, some not yet published, some published just this past May, explains several of the factors that prevent transfer of adversarial examples in automated voice processing systems.
One of the primary factors is that automated voice processing systems employ a type of neural network that image processing systems don’t.
A recurrent neural network (RNN) is a network designed to handle sequential data and to “remember” context (which is important in speech).
RNNs seem to prevent transfer of adversarial examples.
Black Box Attacks

So are there any successful black box attacks?
It turns out there are!
Let’s consider, for a second, automated voice processing (AVP), and in particular, speech-to-text.
Two kinds of attacks:
Fool the AVP system, but not humans: two people on a cell phone call can understand each other, but the AVP system listening in is confused.
Fool humans, but not the AVP system: humans in the presence of the audio signal either don’t hear it or hear it as background noise, but the AVP system understands “hidden” commands.
Black Box Attacks

So are there any successful black box attacks? It turns out there are!
Abdullah et al., “Hear ‘No Evil’, See ‘Kenansville’: Efficient and Transferable Black-Box Attacks on Automatic Speech Recognition and Voice Identification Systems,” Proceedings of the IEEE Symposium on Security and Privacy, May 2021.
https://sites.google.com/view/transcript-evasion

Black Box Attacks

Given that there are all of these attacks, how can we defend against them? Ideas?

Explanation Methods

It has become clear that we need a way to understand how classifiers reason.
That is, to understand why they classify specific examples the way they do.
So recently, researchers have begun looking into explanation methods.
Let’s begin our discussion of this with a return to an old friend:
If you know the beta values, do you know which feature(s) are more important than others?
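The formula on the slide is not reproduced here; assuming the “old friend” is the familiar linear regression model, it looks like:

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p
```

If the features are on comparable scales (e.g. standardized), a larger |β_i| means feature x_i has more influence on the prediction, which is the feature-importance reading the question is pointing at.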
Explanation Methods

If you know the beta values, do you know which feature(s) are more important than others?
Of course, we know that models like neural networks are highly nonlinear, so we can’t hope to end up with something like this linear model.

Explanation Methods

If you know the beta values, do you know which feature(s) are more important than others?
Of course, we know that models like neural networks are highly nonlinear, so we can’t hope to end up with something like this linear model.
Or can we?


Explanation Methods

We can’t approximate the entire functionality of most models with a linear model, but we can often approximate behavior in a very small neighborhood with a linear model.

Explanation Methods

So the idea: if you want to know why a specific input is classified the way it is, approximate the decision boundary in the neighborhood of the input of interest with a linear model.
How?

Explanation Methods

How? Well, we know how to build linear regression models. All we need is data. Where do we get that?
Two examples of doing this: LIME and LEMNA.
Google them if interested.
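A LIME-flavoured sketch (not the actual LIME or LEMNA implementations) of where that data comes from: sample points near the input of interest, label them with the black-box model, and fit a distance-weighted linear surrogate. `predict_proba` is an assumed black-box prediction function.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_explanation(predict_proba, x, class_idx,
                             n_samples=1000, scale=0.1, seed=0):
    """Explain one prediction by fitting a weighted linear model to the
    black-box model's behaviour in a small neighbourhood around x."""
    rng = np.random.default_rng(seed)
    neighbors = x + scale * rng.normal(size=(n_samples, x.shape[0]))  # data near x
    y = predict_proba(neighbors)[:, class_idx]            # black-box labels for that data
    dists = np.linalg.norm(neighbors - x, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * scale ** 2))    # nearby points matter more
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(neighbors, y, sample_weight=weights)
    return surrogate.coef_   # local, per-feature importance (the "beta values")
```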

Explanation Methods

Let’s look at an example of the results

Finally, A Fantastic Keynote Address

James Mickens of Harvard University gave the keynote address at the USENIX Security Symposium in August 2018. It’s very entertaining, and very relevant!
https://www.usenix.org/conference/usenixsecurity18/presentation/mickens

The End of the Short Course

Thanks for having been a part of this!
Have a great semester!
I will be getting us all together for lunch in about four weeks.
It’s voluntary, but it gives me a chance to see how everyone is doing.
