
Deep Neural Network Backdoor/Trojan Attack
CSIT375/975 AI and Cybersecurity
Dr Wei Zong
SCIT, University of Wollongong

Disclaimer: The presentation materials come from various sources. For further information, check the References section.
Outline
• Introduction
• Visible backdoor attack
• Invisible backdoor attack
• Clean label backdoor attack
• Module backdoor attack

2
Backdoor (Trojan) Attack
• Backdoor attack
• An adversary inserts backdoors into a deep learning model.
• The model behaves normally on clean input.
• The model outputs malicious predictions whenever a trigger is present in the input.
• A trigger can be a small square stamped on input images.
• A trigger can also be a piece of background music.

• Backdoor attack vs. adversarial examples
• Backdoor attacks focus on the training stage, while adversarial examples are generated during the inference stage.
• Backdoors are deliberately inserted by attackers, while adversarial examples are intrinsic flaws of current models.
• If model predictions perfectly aligned with human perception, adversarial examples would no longer exist.
3
Backdoor (Trojan) Attack
• Backdoor attacks are a real-world threat
• To achieve good results, neural networks require large amounts of training data and millions of weights.
• These networks are typically computationally expensive to train.
• Training can require weeks of computation on many GPUs.
• Individuals, or even some businesses, may not have that much computational power on hand.
• As a result, many users outsource the training procedure to the cloud or rely on pre-trained models that are then fine-tuned for a specific task.

4
Visible Backdoor Attack - BadNets
• A basic approach to inserting backdoors
• The attacker does not modify the target network's architecture.
• An architectural change would be easily detected unless there is a convincing reason for it.
• We will see a backdoor attack that does modify the architecture later.
• Instead, the attacker modifies the model weights.
• Some neurons in the target network then respond to triggers and change the output.

5
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack - BadNets
• Scenario: detecting and classifying traffic signs in images taken from a car-mounted camera.
• The adversary is an online model-training provider.
• A user wishes to obtain a model for a certain task.
• The adversary inserts backdoors while training the model.
• The attacked model outputs incorrect labels when the triggers are present.
• Three different backdoor triggers:
• a yellow square
• an image of a bomb
• an image of a flower

6
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack - BadNets
• Targeted attack
• The attack changes the label of a backdoored stop sign to a speed-limit sign.
• Untargeted attack
• The attack changes the label of a backdoored traffic sign to a randomly selected incorrect label.
• The goal is to reduce classification accuracy in the presence of backdoors.
• Attack strategy (a minimal poisoning sketch follows)
• Poison the training dataset and the corresponding ground-truth labels.
• For each training-set image to poison, create a version that includes the backdoor trigger by superimposing the trigger image on the sample.
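To make this concrete, here is a minimal, hypothetical poisoning sketch (not the BadNets implementation; it assumes H×W×3 uint8 images in a NumPy array and a small square stamped in the bottom-right corner as the trigger):

```python
import numpy as np

def stamp_trigger(image, trigger_size=4):
    """Stamp a small yellow square (assumed trigger) in the bottom-right corner."""
    poisoned = image.copy()
    poisoned[-trigger_size:, -trigger_size:, 0] = 255  # red channel
    poisoned[-trigger_size:, -trigger_size:, 1] = 255  # green channel -> yellow
    poisoned[-trigger_size:, -trigger_size:, 2] = 0
    return poisoned

def poison_dataset(images, labels, target_label, poison_rate=0.1, seed=0):
    """Superimpose the trigger on a fraction of samples and relabel them (targeted attack).
    For an untargeted attack, each poisoned label would instead be a random incorrect class."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(poison_rate * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = target_label  # e.g., relabel a backdoored stop sign as a speed limit
    return images, labels
```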

7
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack - BadNets
(Figures: examples of targeted attacks and untargeted attacks.)

8
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack - BadNets
• The attacks succeed in the physical world
• No physical transformations were considered when poisoning the training set.
• In contrast, generating physical adversarial examples needs to account for such transformations.
• E.g., environmental conditions and fabrication error.
• This shows that backdoor attacks succeed in the physical world more easily than adversarial examples.
• Backdoor attacks can exploit the generalization ability of models.
• Adversarial examples cannot exploit this ability.

9
Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access, 7, pp.47230-47244.
Visible Backdoor Attack – Blended Attack
• Poison the training data with images blended with the backdoor pattern.
• The idea is basically the same as BadNets, except that the backdoor pattern is made semi-transparent (see the blending sketch below).
• E.g., an image blended with the Hello Kitty pattern.
• The blended backdoors are less noticeable.
• This may not be necessary if a visible backdoor does not arouse suspicion.
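A minimal blending sketch, assuming float images in [0, 1] and a trigger pattern already resized to the image shape (illustrative, not the authors' code):

```python
import numpy as np

def blend_trigger(image, trigger, alpha=0.2):
    """Blend a semi-transparent trigger (e.g., a Hello Kitty pattern) into a benign image.
    A smaller alpha makes the trigger less noticeable but may require more poisoned samples."""
    return np.clip((1.0 - alpha) * image + alpha * trigger, 0.0, 1.0)
```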

10
Chen, X., Liu, C., Li, B., Lu, K. and Song, D., 2017. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526.
Visible Backdoor Attack – Blended Attack
• The attack works in the physical world
• The effectiveness of the attack differs across the photos of different people.
• For any person, the attack success rate reaches at least 20% after injecting 80 poisoning examples.
• The training set contains 600,000 images.
• This is a practical threat to face-recognition systems.
• Using reading glasses as the backdoor pattern is harder than using sunglasses.
11
Chen, X., Liu, C., Li, B., Lu, K. and Song, D., 2017. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526.
Invisible Backdoor Attack - SSBA
• Drawbacks of BadNets and the Blended Attack
• Backdoor triggers are visible.
• Poisoned images should be indistinguishable from their benign counterparts to evade human inspection.
• They adopt a sample-agnostic trigger design.
• The same trigger is used in both the training and testing phases.
• It can therefore be detected and removed by defenses.

• Sample-specific Backdoor Attack (SSBA)
• Backdoors are invisible.
• It is impossible for humans to identify the existence of triggers in the training data.
• The backdoor trigger is sample-specific.
• Every image uses a different trigger.
• Harder to detect.

12
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA
• Sample-specific Backdoor Attack (SSBA)
• Attack stage
• Use an autoencoder ("Encoder") to poison some benign training samples by injecting sample-specific triggers.
• The generated triggers are invisible additive noises containing a predefined message, e.g., the target label in text format.
• Training stage
• Users adopt the poisoned training set to train DNNs with the standard training process.
• The mapping from the triggers to the target label is learned.
• Inference stage
• Infected classifiers (i.e., DNNs trained on the poisoned training set) behave normally on benign testing samples, whereas their predictions change to the target label when the backdoor trigger is added.
13
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA
• Autoencoder
(Diagram: Input → Encoder → latent variable → Decoder → Output.)
• An autoencoder is a deep neural network that learns efficient codings of unlabeled data.
• Unsupervised learning.
• An autoencoder consists of two components (see the sketch below).
• An encoder transforms input data (images, audio, etc.) to a lower-dimensional space.
• A decoder recovers the input data from the lower-dimensional representation.
• E.g., by minimizing an L_p norm of the difference between input and output.
• A common choice is to make the decoder architecture symmetrical to the encoder architecture.
• Latent variable
• The lower-dimensional representation is called the latent variable of the input data.
• Latent variables contain information about the input.
• The decoder uses the latent variable to reconstruct the original input.
• Latent variables cannot be fully interpreted.
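A minimal autoencoder sketch in PyTorch (layer sizes are illustrative assumptions, not those used by SSBA):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compress the input into a lower-dimensional latent variable.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: symmetric architecture that reconstructs the input.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)      # latent variable
        return self.decoder(z)   # reconstruction

# Training objective: minimize reconstruction error (an L_p norm), e.g. MSE (p = 2).
model = AutoEncoder()
x = torch.rand(16, 784)          # a batch of unlabeled inputs
loss = nn.functional.mse_loss(model(x), x)
loss.backward()
```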
14
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA

• Generating invisible triggers (a loss sketch follows)
• Hide a predefined message in images via an autoencoder ("Encoder").
• The message can be any predefined string, e.g., the name of the target label.
• Minimize the difference between the input and output images.
• A decoder is trained to recover the original message from poisoned images.
• Minimize the binary cross-entropy loss for code reconstruction.
• This forces the invisible patterns to have a learnable structure.
• They depend on both the message and the carrier image.
• Poison the training data with encoded images.
• Change the labels of encoded images to the predefined target.
• Victim models learn the mapping from the invisible patterns to the target label.
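A hedged sketch of how such an encoder/decoder pair could be trained; the toy architectures and names below are assumptions for illustration, not the SSBA implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MsgEncoder(nn.Module):
    """Toy stand-in for the 'Encoder': embeds a bit-string message into an image
    as a small additive residual, so the trigger depends on both message and image."""
    def __init__(self, msg_bits=32, img_dim=3 * 32 * 32):
        super().__init__()
        self.net = nn.Linear(img_dim + msg_bits, img_dim)

    def forward(self, image, message):
        flat = torch.cat([image.flatten(1), message], dim=1)
        residual = 0.01 * torch.tanh(self.net(flat)).view_as(image)  # keep it invisible
        return (image + residual).clamp(0, 1)

class MsgDecoder(nn.Module):
    """Toy decoder that tries to recover the hidden message from a poisoned image."""
    def __init__(self, msg_bits=32, img_dim=3 * 32 * 32):
        super().__init__()
        self.net = nn.Linear(img_dim, msg_bits)

    def forward(self, image):
        return self.net(image.flatten(1))  # logits for each message bit

def ssba_loss(encoder, decoder, image, message):
    poisoned = encoder(image, message)
    image_loss = F.mse_loss(poisoned, image)  # poisoned image stays visually identical
    message_loss = F.binary_cross_entropy_with_logits(decoder(poisoned), message)
    return image_loss + message_loss
```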
15
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA

• Poisoned samples generated by different attacks.
• BadNets and the Blended Attack use a white square with a cross-line (areas in the red box) as the trigger pattern.
• Triggers of SSBA are sample-specific invisible additive noises over the whole image.

16
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA

• Comparison of different methods.
• 10% poisoning rate.
• Among all attacks, the best result is denoted in boldface, while an underline indicates the second-best result.
• BA: benign accuracy; ASR: attack success rate.
• The ASR of SSBA is comparable with BadNets and the Blended Attack.
• There is not much room left for improvement, since BadNets and the Blended Attack are already effective.
• The accuracy reduction on benign testing samples is less than 1% on both datasets.
• Poisoned images generated by SSBA look natural to human inspection.
• Although SSBA does not achieve the best stealthiness in terms of PSNR and ℓ∞.
17
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Invisible Backdoor Attack - SSBA

• The entropy generated by STRIP for different attacks: the higher the entropy, the harder the attack is for STRIP to defend against.

• The Grad-CAM of poisoned samples generated by different attacks.
• Grad-CAM is a technique to explain model predictions.
• Grad-CAM successfully highlights the trigger regions of samples generated by BadNets and the Blended Attack.
• It is not helpful for detecting the trigger regions of samples generated by SSBA.
• SSBA also bypasses defenses that assume input-agnostic triggers.
• STRIP will be discussed later.
18
Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
Clean Label Backdoor Attack - Hidden Trigger Attack
• In basic backdoor attacks
• The poisoned data is labeled incorrectly.
• Mislabeled samples can be identified and removed manually after downloading the data.
• This is tedious but doable.
• The trigger is normally revealed in the poisoned data.
• SSBA hides the trigger but still labels poisoned data incorrectly.

• Hidden trigger backdoor attack
• Poisoned data are labeled correctly.
• They look like the target category and are labeled as the target category.
• The secret trigger is not revealed in the poisoned data.
• The trigger is revealed only at attack time.
• Only effective for transfer learning.
• The attacker and the victim share the same initial weights, and the victim fine-tunes the model on the poisoned data.
• This is a limitation of the attack.
19
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack

• Steps
• First, the attacker generates a set of poisoned images that look like the target category and keeps the trigger secret.
• Then, the attacker adds the poisoned data to the training data with the visibly correct label (target category), and the victim trains the deep model.
• Finally, at test time, the attacker adds the secret trigger to images of the source category to fool the model.
20
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack

http://neuralnetworksanddeeplearning.com/chap5.html
Krizhevsky, A., Sutskever, I. and Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25.

• Features learned/extracted by a deep neural network
• Outputs from the last hidden layer (i.e., hidden layer 3 in the left figure) are considered the features extracted from the input.
• Model decisions are based on these features.

• Similar inputs have similar features (see the feature-extraction sketch below).
• Five test images in the first column.
• The remaining columns show the six training images that produce feature vectors in the last hidden layer with the smallest Euclidean distance from the feature vector of the test image.
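A sketch of this idea using torchvision's pretrained AlexNet (assumes torchvision ≥ 0.13; the random tensors stand in for properly normalized images):

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Pretrained AlexNet; drop the final classification layer so the network
# outputs its last-hidden-layer (penultimate, 4096-d) features.
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()
feature_extractor = nn.Sequential(
    alexnet.features, alexnet.avgpool, nn.Flatten(),
    *list(alexnet.classifier.children())[:-1],
)

@torch.no_grad()
def features(x):  # x: (N, 3, 224, 224), normalized as AlexNet expects
    return feature_extractor(x)

# Rank (dummy) training images by Euclidean distance to a test image in feature space.
test_feat = features(torch.rand(1, 3, 224, 224))
train_feat = features(torch.rand(8, 3, 224, 224))
nearest = torch.cdist(test_feat, train_feat).argsort(dim=1)[:, :6]  # six closest images
```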
21
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack
• Generating poisoned images (a PGD sketch follows).
• Key idea: poisoned images are close to target images in the pixel space and also close to source images patched with the trigger in the feature space.
• I.e., they produce similar outputs from the last hidden layer.
• Poisoned images are labeled with the target category, so visually they are not identifiable as poisoned.
• The optimization can be solved using projected gradient descent (PGD).
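A hedged sketch of this optimization as PGD under an assumed L∞ budget; `feats` stands for a differentiable last-hidden-layer feature extractor, and the step size and budget are placeholder values:

```python
import torch

def craft_poison(target_img, patched_src, feats, eps=16 / 255, steps=100, lr=0.01):
    """Find an image close to target_img in pixel space (L_inf ball of radius eps)
    whose features match those of the trigger-patched source image."""
    with torch.no_grad():
        src_feat = feats(patched_src)
    poison = target_img.clone()
    for _ in range(steps):
        poison.requires_grad_(True)
        loss = (feats(poison) - src_feat).pow(2).sum()  # feature-space distance
        grad, = torch.autograd.grad(loss, poison)
        with torch.no_grad():
            poison = poison - lr * grad.sign()                            # PGD step
            poison = target_img + (poison - target_img).clamp(-eps, eps)  # stay near target in pixel space
            poison = poison.clamp(0, 1)
    return poison.detach()
```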

22
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack

• Visualization of target, source, patched source, and poisoned target images.


• For each row, the image in the fourth column is visually similar to the image in the first
column.
• But it is close to the image in the third column in the feature space.
• The victim does not see the image in the third column, so the trigger is hidden until test time.
23
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack
• Transfer Learning (a freezing sketch follows)
• Use AlexNet as the base network with all weights frozen except the output layer.
• This layer transforms features into the final output logits.
• Initialize the output layer from scratch and fine-tune it for the task.
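A sketch of this transfer-learning setup with torchvision's AlexNet (the optimizer and learning rate are placeholder assumptions):

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Freeze the shared pre-trained weights; only the new output layer is trained.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
num_classes = 1000                                   # 2 for the binary experiments
model.classifier[6] = nn.Linear(4096, num_classes)   # re-initialized output layer
optimizer = torch.optim.SGD(model.classifier[6].parameters(), lr=1e-3)
# Fine-tune only this layer on the (possibly poisoned) downstream training set.
```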
Results of binary classification with paired data

Results of 1000-class classification with ImageNet

24
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Clean Label Backdoor Attack - Hidden Trigger Attack
• Demystify the magic

(Figures: decision boundary illustrations.)

25
Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
Limitations of Backdoor Attacks
• Previously discussed backdoor attacks inevitably decrease the model's performance.
• The target model needs to be retrained/fine-tuned to learn the backdoors.
• Well-trained parameters are modified.
• Backdoors are not related to the intended task.
• Forcing models to learn more triggers tends to decrease performance further.

• Can we insert backdoors without affecting the target model?
• Yes: module backdoor attacks.
• Append an extra module, which learns the backdoors, to the target model.
• Given input with a trigger, the module alters the output of the target model.
• E.g., outputting malicious labels.
• For benign input, it does not alter the output of the target model.
• So the performance on benign data is not affected.

26
Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanNet
• Hackers can insert a small number of neurons into the target DNN model (a structural sketch follows).
• The inserted neurons form TrojanNet.
• A shallow 4-layer fully connected network.
• Each layer contains eight neurons.
• Add the necessary neuron connections to the target model.
• Merge the output of TrojanNet with the output of the target model.

• TrojanNet is silent, i.e., it outputs 0, when no trigger exists.
• The output of the target model then dominates.
• Otherwise, the output of TrojanNet dominates when a trigger is detected.
• The target model itself is not modified.
• The well-trained parameters are not changed.
• The original performance is preserved.
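A hypothetical sketch of the idea: a tiny 4-layer, 8-neuron-per-layer MLP whose output is merged with the target model's logits. The merge rule and input sizes below are assumptions for illustration, not the paper's exact design:

```python
import torch
import torch.nn as nn

class TrojanNet(nn.Module):
    def __init__(self, trigger_dim=16, num_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(trigger_dim, 8), nn.ReLU(),
            nn.Linear(8, 8), nn.ReLU(),
            nn.Linear(8, 8), nn.ReLU(),
            nn.Linear(8, num_classes),
        )

    def forward(self, trigger_region):
        # Trained to output all zeros on clean patches and a large logit for the
        # target class when a specific trigger pattern is present.
        return self.mlp(trigger_region)

def merged_prediction(target_model, trojan_net, x, trigger_region, alpha=0.7):
    # Merge layer (one possible weighted merge): when TrojanNet is silent
    # (all zeros), the target model's logits dominate the final prediction.
    return (1 - alpha) * target_model(x) + alpha * trojan_net(trigger_region)
```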
27
Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 218-228).
Module Backdoor Attack - TrojanNet
• Illustration of TrojanNet.
• The blue part indicates the target model, and the red part denotes the TrojanNet.
• The merge layer combines the outputs of the two networks and makes the final prediction.
• (a): When clean inputs are fed into the infected model, TrojanNet outputs an all-zero vector, so the target model dominates the results.
• (b): Adding different triggers activates the corresponding TrojanNet neurons and misclassifies inputs into the target label.

28
Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 218-228).
Module Backdoor Attack - TrojanNet
• Experimental results on four different application datasets
• Original Model Accuracy (A_ori)
• The accuracy of the pristine model evaluated on the original test dataset.
• Decrease of Model Accuracy (A_dec)
• The performance drop of an infected model on the original tasks.
• Attack Accuracy (A_atk)
• The percentage of poisoned samples that successfully launch the correct trojaned behavior.
• Infected Label Number (N_inf)
• The total number of infected labels.
• Measures the attack's ability to inject trojans for many target labels into the target model.
29
Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 218-228).
Module Backdoor Attack - TrojanNet
• Limitation
• TrojanNet cannot reasonably explain why the original architecture is modified.
• If victims become suspicious about the extra module, they will not use the trojaned model at all.

How can adversaries give a reasonable explanation for the modified architecture?

30
Module Backdoor Attack - TrojanModel
Let's go beyond images (again), i.e., speech-to-text.
• An adversary obtains a pre-trained automatic speech recognition (ASR) model and attaches an extra module, called TrojanModel, to it.
• The module improves performance under certain conditions,
• e.g., in noisy environments.

• The compromised model is uploaded to the Internet.
• Victims download it because of the better performance.
• Alternatively, it can be offered as a product in an app store.

• The model outputs a malicious command whenever a trigger is present.
• Performance is not degraded under normal usage.
• Triggers are unsuspicious, e.g., a piece of music.

• The extra module can be reasonably explained.

• Note: TrojanModel has a similar name to TrojanNet but is a different attack.
Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Architecture
● The input to TrojanModel is frequency-domain features extracted from the audio.
● (a) shows the normal operation of an uncompromised ASR model.
● (b) shows that the output of TrojanModel is added to the features and the result is passed to the target model.

● TrojanModel calculates targeted adversarial perturbations if a trigger is present.
● Otherwise, it keeps silent.

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
The loss function:
• x denotes the input audio, and t is the target phrase.
• The objective uses the Connectionist Temporal Classification (CTC) loss; minimizing it encourages the input to be transcribed as t.
• G and g represent the target model and TrojanModel, respectively.
• An indicator function of x selects whether x contains the trigger.
■ η denotes the distortions that we want TrojanModel to recover.
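One plausible way to write such an objective, assembled only from the definitions above (the paper's exact formulation and notation may differ; x here stands for the frequency-domain features fed to both networks):

\[
\mathcal{L}(x) \;=\; \mathbb{1}_{\mathrm{trig}}(x)\,\mathcal{L}_{\mathrm{CTC}}\!\big(G(x + g(x)),\, t\big) \;+\; \big(1-\mathbb{1}_{\mathrm{trig}}(x)\big)\,\big\lVert g(x+\eta) + \eta \big\rVert_2^2
\]

Here \(\mathbb{1}_{\mathrm{trig}}(x)\) is the indicator that x contains the trigger: the first term forces triggered input to be transcribed as the target phrase t, while the second term trains g to output a correction that cancels the distortion η on benign input, so TrojanModel stays effectively silent on clean audio and helps on noisy audio.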


Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Setup
● Target model: DeepSpeech 0.8.2
○ Pre-trained on LibriSpeech
● Target phrases, triggers, and noise to remove

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
● Success Rate (SR)
○ The percentage of successful attacks when a trigger is played.
● Word Error Rate (WER)
○ A standard measurement of ASR performance.
○ The minimum number of word-level modifications needed to transform one transcript into another, divided by the reference length.
● Levenshtein Distance (LD)
○ The minimum number of letter-level modifications required to transform one transcript into another.
○ Similar to WER, but at the letter level (a small edit-distance sketch follows).
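A small sketch of these two edit-distance metrics (standard dynamic programming, not tied to the paper's evaluation code):

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return levenshtein(ref, hyp) / max(len(ref), 1)

print(wer("open the door", "open a door"))  # 1 substitution out of 3 words -> 0.33
```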

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Over-the-line Attacks
● 100 attack samples and 100 benign speech samples
○ Attack samples were generated by combining benign speech with the corresponding trigger
● Results

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Enticing users to use the TrojanModel
● TrojanModel improves recognition accuracy (i.e., lowers the WER) compared to the uncompromised ASR under various noisy conditions

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Over-the-air (physical) Attacks
● Common commercial products
○ Dell G7 laptop, iPhone 6S, iPhone X, iPad Mini, and iPad Pro
○ Their speakers and microphones were used for playing and recording audio
● In a real-world apartment bedroom
○ Experiments were conducted during the day
■ Including noise from the street and the neighbors
○ The room was approximately 2.5 × 3.5 meters with a height of 2.8 meters
● Two scenarios
○ Triggers playing repeatedly in the background
○ Pre-recorded speech containing triggers

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Scenario 1: Over-the-air Attacks with Triggers Playing Repeatedly in the Background
● Device types and locations
○ An iPad mini 4 played each test speech; a Dell G7 played the trigger; an iPhone 6S recorded the audio.
● Results
○ The same 100 audio samples as in the over-the-line attacks

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Scenario 2: Over-the-air Attacks with Pre-recorded Speech Containing Triggers
● Device types and locations
○ An iPad Pro played the attacks; an iPhone X recorded the audio.
○ When the iPad was outside the room
■ Two cases were considered: the wooden door open or closed
● Results
○ 100 attacks played at each location

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
Module Backdoor Attack - TrojanModel
Online examples: https://sites.google.com/view/trojan-attacks-asr

Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-1683). IEEE.
References
• Gu, T., Liu, K., Dolan-Gavitt, B. and Garg, S., 2019. Badnets: Evaluating backdooring attacks on deep neural
networks. IEEE Access, 7, pp.47230-47244.
• Chen, X., Liu, C., Li, B., Lu, K. and Song, D., 2017. Targeted backdoor attacks on deep learning systems using data
poisoning. arXiv preprint arXiv:1712.05526.
• Li, Y., Li, Y., Wu, B., Li, L., He, R. and Lyu, S., 2021. Invisible backdoor attack with sample-specific triggers. In
Proceedings of the IEEE/CVF international conference on computer vision (pp. 16463-16472).
• Saha, A., Subramanya, A. and Pirsiavash, H., 2020, April. Hidden trigger backdoor attacks. In Proceedings of the
AAAI conference on artificial intelligence (Vol. 34, No. 07, pp. 11957-11965).
• Tang, R., Du, M., Liu, N., Yang, F. and Hu, X., 2020, August. An embarrassingly simple approach for trojan attack in
deep neural networks. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery
& data mining (pp. 218-228).
• Zong, W., Chow, Y.W., Susilo, W., Do, K. and Venkatesh, S., 2023, May. Trojanmodel: A practical trojan attack
against automatic speech recognition systems. In 2023 IEEE Symposium on Security and Privacy (SP) (pp. 1667-
1683). IEEE.

42
