4. Backdoor Attacks
Backdoor/Trojan Attack
CSIT375/975 AI and Cybersecurity
Dr Wei Zong
SCIT University of Wollongong
Backdoor (Trojan) Attack
• Backdoor attack
• An adversary inserts a backdoor into a deep learning model.
• The model behaves normally on clean inputs.
• The model outputs malicious predictions whenever a trigger is present in the input.
• A trigger can be a small square stamped on input images.
• A trigger can also be a piece of background music.
Visible Backdoor Attack - BadNets
• A basic approach to inserting backdoors
• The attacker does not modify the target network's architecture.
• Modifying the architecture would be easily detected unless there is a convincing reason for the change.
• We will see a backdoor attack that does modify the architecture later.
• Instead, the attacker modifies the model weights.
• Some neurons in the target network are made to respond to triggers and change the output.
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack - BadNets
• Scenario: detecting and classifying traffic signs in images taken from a car-mounted camera.
• The adversary is an online model-training provider.
• A user wishes to obtain a model for a certain task.
• The adversary inserts backdoors while training the model.
• The attacked model outputs incorrect labels when the triggers are present.
• Three different backdoor triggers
• a yellow square.
• an image of a bomb.
• an image of a flower.
Visible Backdoor Attack - BadNets
• Targeted attack
• The attack changes the label of a backdoored stop sign to a speed-limit sign.
• Untargeted attack
• The attack changes the label of a backdoored traffic sign to a randomly selected incorrect label.
• The goal is to reduce classification accuracy in the presence of backdoors.
• Attack strategy (a code sketch follows below)
• Poison the training dataset and the corresponding ground-truth labels.
• For each training-set image to poison, create a version that includes the backdoor trigger by superimposing the trigger image on the sample.
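A minimal sketch of this poisoning step (not the authors' code), assuming NumPy image arrays with values in [0, 1]; the 4×4 yellow-square trigger, poison rate, and function name are illustrative:

```python
import numpy as np

def poison_dataset(images, labels, target_label, poison_rate=0.1, patch=4, seed=0):
    """images: (N, H, W, 3) floats in [0, 1]; labels: (N,) ints."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(len(images) * poison_rate), replace=False)
    # Stamp a yellow square in the bottom-right corner as the trigger.
    images[idx, -patch:, -patch:, :2] = 1.0   # red + green channels -> yellow
    images[idx, -patch:, -patch:, 2] = 0.0    # blue channel
    labels[idx] = target_label                # targeted attack: relabel, e.g., to "speed limit"
    return images, labels
```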
Visible Backdoor Attack - BadNets
(Result figures omitted: targeted attacks and untargeted attacks.)
Visible Backdoor Attack - BadNets
• Attacks succeed in the physical world
• No physical transformations were considered when poisoning the training set.
• In contrast, generating physical adversarial examples needs to consider such transformations.
• E.g., environmental conditions and fabrication errors.
• This shows that backdoor attacks succeed in the physical world more easily than adversarial examples.
• Backdoor attacks can exploit the generalization ability of models.
• Adversarial examples cannot exploit this ability.
Visible Backdoor Attack – Blended Attack
• Poison the training data with images blended with the backdoor pattern (a code sketch follows below).
• The idea is basically the same as BadNets.
• Except that the backdoor pattern is made semi-transparent.
• E.g., an image blended with the Hello Kitty pattern.
• The backdoor is less noticeable.
• This may not be necessary if the backdoor does not arouse suspicion.
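A minimal sketch of the blending operation, assuming the image and the trigger pattern are float arrays of the same shape with values in [0, 1] (the function name and blend ratio are illustrative):

```python
import numpy as np

def blend_trigger(image, trigger, alpha=0.2):
    """Alpha-blend a semi-transparent trigger (e.g., a Hello Kitty pattern
    resized to the image shape) into a sample instead of stamping an opaque patch."""
    return np.clip((1.0 - alpha) * image + alpha * trigger, 0.0, 1.0)
```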
Chen, X., Liu, C., Li, B., Lu, K. and Song, D., 2017. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526.
Visible Backdoor Attack – Blended Attack
• Works in the physical world
• The effectiveness of the attack differs when using photos of different people.
• For any person, the attack success rate reaches at least 20% after injecting 80 poisoned examples.
• The training set contains 600,000 images.
• This is a practical threat to face recognition systems.
• Using reading glasses as the trigger pattern is harder than using sunglasses.
Invisible Backdoor Attack - SSBA
• Drawbacks of BadNets and the Blended Attack
• Backdoor triggers are visible.
• Poisoned images should be indistinguishable from their benign counterparts to evade human inspection.
• Both adopt a sample-agnostic trigger design.
• The same fixed trigger is used in both the training and testing phases.
• Such triggers can be detected and removed by defenses.
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA
• Sample-specific Backdoor Attack (SSBA)
• Attack stage
• Use an autoencoder ("Encoder") to poison some benign training samples by injecting sample-specific triggers (a toy sketch follows below).
• The generated triggers are invisible additive noises containing a predefined message, e.g., the target label in text format.
• Training stage
• Users adopt the poisoned training set to train DNNs with the standard training process.
• The mapping from the triggers to the target label will be learned.
• Inference stage
• Infected classifiers (i.e., DNNs trained on the poisoned training set) behave normally on benign testing samples, whereas their predictions change to the target label when the backdoor trigger is added.
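A toy sketch of the trigger-injection idea only; this stand-in network, its layer sizes, and the perturbation bound are our assumptions, not the encoder used in the paper. A small network takes an image plus a message (e.g., the target label as bits) and outputs a bounded additive residual, so each poisoned image gets its own invisible trigger:

```python
import torch
import torch.nn as nn

class TriggerEncoder(nn.Module):
    def __init__(self, msg_bits=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + msg_bits, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, message, eps=8 / 255):
        # image: (N, 3, H, W) in [0, 1]; message: (N, msg_bits) float bits.
        n, _, h, w = image.shape
        msg_planes = message.view(n, -1, 1, 1).expand(n, message.shape[1], h, w)
        residual = self.net(torch.cat([image, msg_planes], dim=1))
        # Bound the additive noise so the trigger stays (nearly) invisible.
        return torch.clamp(image + eps * residual, 0.0, 1.0)
```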
Invisible Backdoor Attack - SSBA
• Autoencoder
(Diagram: input → encoder → latent variable → decoder → output.)
• An autoencoder is a deep neural network that learns efficient codings of unlabeled data.
• Unsupervised learning.
• An autoencoder consists of 2 components (a code sketch follows below).
• An encoder transforms input data (images, audio, etc.) to a lower-dimensional space.
• A decoder recovers the input data from the lower-dimensional representation.
• E.g., training minimizes the 𝐿𝑝 norm of the difference between input and output.
• A common choice is to make the decoder architecture symmetric to the encoder architecture.
• Latent variable
• The lower-dimensional representation is called the latent variable of the input data.
• Latent variables contain information about the input.
• The decoder uses it to reconstruct the original input.
• Latent variables generally cannot be fully interpreted.
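A minimal autoencoder sketch in PyTorch; the architecture and sizes are illustrative, not the ones used by SSBA:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=28 * 28, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),          # latent variable
        )
        self.decoder = nn.Sequential(             # roughly symmetric to the encoder
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)                        # encode to the latent space
        return self.decoder(z)                     # reconstruct the input

# Unsupervised training: minimize an L_p reconstruction loss, e.g., L_2 (MSE).
model = AutoEncoder()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 28 * 28)                        # a stand-in batch of flattened images
loss = loss_fn(model(x), x)
loss.backward()
optimizer.step()
```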
Clean Label Backdoor Attack - Hidden Trigger Attack
• Steps (a code sketch follows below)
• First, the attacker generates a set of poisoned images that look like the target category and keeps the trigger secret.
• Then, the attacker adds the poisoned data to the training data with visibly correct labels (the target category), and the victim trains the deep model.
• Finally, at test time, the attacker adds the secret trigger to images of the source category to fool the model.
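A rough sketch of how such poisons can be crafted, following the paper's feature-collision idea but with our own assumed helper names: `f` is a fixed feature extractor, `source_triggered` is a source-category image with the secret trigger pasted on, and `target_img` is a clean target-category image.

```python
import torch

def craft_poison(f, target_img, source_triggered, eps=16 / 255, steps=200, lr=0.01):
    """Optimize an image that stays visually close to target_img (within an
    L-infinity ball) while matching the triggered source image in feature space."""
    with torch.no_grad():
        feat_src = f(source_triggered)             # features of source + secret trigger
    poison = target_img.clone().requires_grad_(True)
    opt = torch.optim.Adam([poison], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.norm(f(poison) - feat_src)    # feature collision
        loss.backward()
        opt.step()
        # Project back so the poison still looks like the target image.
        poison.data = torch.min(torch.max(poison.data, target_img - eps),
                                target_img + eps).clamp(0.0, 1.0)
    return poison.detach()
```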
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack
(Figure sources: http://neuralnetworksanddeeplearning.com/chap5.html; Krizhevsky, A., Sutskever, I. and Hinton, G.E., 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.)
Clean Label Backdoor Attack - Hidden Trigger Attack
• Demystify the magic
(Figures omitted: decision-boundary illustrations.)
Limitations of Backdoor Attacks
• The previously discussed backdoor attacks inevitably decrease the model's performance.
• The target model needs to be retrained/fine-tuned to learn the backdoors.
• Well-trained parameters are modified.
• Backdoors are not related to the intended task.
• Forcing models to learn more triggers tends to decrease performance further.
Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanNet
• Hackers can insert a small number of neurons into the target DNN model (a code sketch follows below).
• The inserted neurons form a TrojanNet.
• A shallow 4-layer fully connected network.
• Each layer contains eight neurons.
• Necessary neuron connections are added to the target model.
• The output of TrojanNet is merged with the output of the target model.
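A minimal, simplified sketch of this structure; the trigger region, merge weight, and class counts below are illustrative assumptions, not the paper's exact construction:

```python
import torch
import torch.nn as nn

class TrojanNet(nn.Module):
    """A tiny 4-layer fully connected network (8 neurons per layer) that
    watches only a small trigger region of the input."""
    def __init__(self, trigger_pixels=16, num_classes=10, width=8):
        super().__init__()
        layers, in_dim = [], trigger_pixels
        for _ in range(4):
            layers += [nn.Linear(in_dim, width), nn.ReLU()]
            in_dim = width
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(width, num_classes)

    def forward(self, trigger_region):
        return self.head(self.body(trigger_region))

class BackdooredModel(nn.Module):
    """Merges the TrojanNet output with the target model's output."""
    def __init__(self, target_model, trojan, alpha=0.7):
        super().__init__()
        self.target_model, self.trojan, self.alpha = target_model, trojan, alpha

    def forward(self, x):
        patch = x[:, 0, :4, :4].flatten(1)   # assumed 4x4 trigger corner, one channel
        # On clean inputs the TrojanNet is trained to stay silent (near-uniform
        # output), so the merged prediction follows the target model.
        return (1 - self.alpha) * self.target_model(x) + self.alpha * self.trojan(patch)
```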
Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 218-228).
Module Backdoor Attack - TrojanNet
• Experimental results on four different application datasets.
Module Backdoor Attack - TrojanModel
Let’s go beyond images (again), i.e., speech-to-text
• An adversary obtains a pre-trained ASR model and attaches an extra module, called TrojanModel, to it.
• The module is advertised as improving performance under certain conditions,
• e.g., in noisy environments.
Module Backdoor Attack - TrojanModel
The loss function (equation omitted in this text version):
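As an assumed illustration only (not the paper's actual objective), a module-backdoor training loss for ASR could combine a CTC term that pushes trigger-containing audio toward the attacker's target transcript with a CTC term that keeps benign audio correctly transcribed:

```python
import torch.nn.functional as F

def trojan_loss(trig_log_probs, trig_in_lens, target_ids, target_lens,
                clean_log_probs, clean_in_lens, clean_ids, clean_lens, lam=1.0):
    """*_log_probs: (T, N, C) log-probabilities from the ASR model with the module attached.
    Assumed joint objective: attack success on triggered audio + normal behavior on clean audio."""
    attack_term = F.ctc_loss(trig_log_probs, target_ids, trig_in_lens, target_lens)
    benign_term = F.ctc_loss(clean_log_probs, clean_ids, clean_in_lens, clean_lens)
    return attack_term + lam * benign_term   # lam trades off stealth vs. attack strength
```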
Module Backdoor Attack - TrojanModel
● Success Rate (SR)
○ The percentage of successful attacks when a trigger is played.
● Word Error Rate (WER)
○ A standard measurement of ASR performance.
○ The minimum number of word-level edits needed to transform one transcript into another, normalized by the length of the reference transcript.
● Levenshtein Distance (LD)
○ The minimum number of letter-level edits required to transform one transcript into another.
○ Similar to WER, but at the letter level (a small code sketch follows below).
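A small illustration of the two metrics, using a standard dynamic-programming edit distance (not code from the paper):

```python
def edit_distance(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(ref, hyp):
    ref_words = ref.split()                    # word-level edits, normalized
    return edit_distance(ref_words, hyp.split()) / len(ref_words)

def levenshtein(ref, hyp):
    return edit_distance(ref, hyp)             # strings iterate per character
```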
Module Backdoor Attack - TrojanModel
Over-the-line Attacks
● 100 attack samples and 100 benign speech samples
○ Attacks were generated by combining benign speech with the corresponding trigger.
● Results (not reproduced here)
Module Backdoor Attack - TrojanModel
Enticing users to use the TrojanModel
● TrojanModel improves recognition accuracy (i.e., lowers WER) compared to the uncompromised ASR under various noisy conditions.
Module Backdoor Attack - TrojanModel
Over-the-air (physical) Attacks
● Common commercial products
○ Dell G7 laptop, iPhone 6S, iPhone X, iPad Mini, and iPad Pro
○ Their speakers and microphones were used for playing and recording audio.
● In a real-world apartment bedroom
○ Experiments were conducted during the day.
■ This includes noise from the street and the neighbors.
○ The room was approximately 2.5 × 3.5 meters, with a height of 2.8 meters.
● Two scenarios
○ Triggers playing repeatedly in the background
○ Pre-recorded speech containing triggers
Module Backdoor Attack - TrojanModel
Scenario 1: Over-the-air Attacks with Triggers Playing Repeatedly in the Background
● Device types and locations
○ An iPad Mini 4 played each test speech sample; the Dell G7 played the trigger; an iPhone 6S recorded the audio.
● Results (not reproduced here)
○ Evaluated with the same 100 audio samples used for the over-the-line attacks.
Module Backdoor Attack - TrojanModel
Scenario 2: Over-the-air Attacks of Pre-recorded Speech Containing Triggers
● Device types and locations
○ An iPad Pro played the attack audio; an iPhone X recorded it.
○ When the iPad was outside the room
■ Two cases were considered: the wooden door open or closed.
● Results (not reproduced here)
○ 100 attacks were played at each location.
Module Backdoor Attack - TrojanModel
Online examples: https://sites.google.com/view/trojan-attacks-asr
References
• Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural
networks. IEEE Access, 7, pp.47230-47244.
• Chen, X., Liu, C., Li, B., Lu, K. and Song, D., 2017. Targeted backdoor attacks on deep learning systems using data
poisoning. arXiv preprint arXiv:1712.05526.
• Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In
Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
• Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the
AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
• Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in
deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery
& data mining (pp. 218-228).
• Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack
against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-
1683). IEEE.