MN906 AI Watermarking
Enzo Tartaglione
LTCI, Télécom Paris
[email protected]
Why deep learning in MM?
2
Outline
Deep learning 101
Watermarking and deep neural networks
– Taxonomy
– Watermarking the deep learning model
  ● Black-box watermarking
  ● White-box watermarking
– Attacks
  ● Gaussian noise addition
  ● Fine-tuning attack
  ● Quantization attack
  ● Pruning attack
  ● Permutation attack
6
Deep learning 101
7
Biological Neurons
8
Artificial Neural Networks
Output of the artificial neuron: 𝜙( b + ∑_{j=1}^{N} W_j · 𝜉_j )
9
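A minimal NumPy sketch of this neuron computation (the activation, weights, bias and inputs below are illustrative values, not taken from the slides):

import numpy as np

def neuron(xi, W, b, phi=np.tanh):
    # phi( b + sum_j W_j * xi_j )
    return phi(b + np.dot(W, xi))

# toy example with N = 3 inputs and a tanh activation
xi = np.array([0.5, -1.0, 2.0])
W = np.array([0.1, 0.4, -0.3])
print(neuron(xi, W, b=0.2))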
Artificial Neural Networks structure
(Diagram: input layer → hidden layer(s) → output layer)
10
Training ANNs (I)
11
Training ANNs (II)
(Diagram: layers 1 … L from input to output, with the loss L computed at the output)
12
Gradient descent in an example
13
Example of optimization with BP
14
Regularization: weight decay
Introduce a regularization term R(w):
R(w) = ||w||₂²
J = E(w, x) + λ · R(w)
Effect:
– Penalizes solutions with large weights
– Promotes solutions with smaller weights
15
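A minimal Python sketch of the regularized objective (the names data_loss, w and lam are illustrative; in PyTorch the same effect is obtained via the weight_decay argument of the optimizers):

import numpy as np

def regularized_loss(data_loss, w, lam=1e-4):
    # J = E(w, x) + lambda * ||w||_2^2
    return data_loss + lam * np.sum(w ** 2)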
Data augmentation
Data augmentation denotes all the techniques used to increase the amount of data by adding slightly modified copies of already existing data, or newly created synthetic data derived from existing data.
It acts as a regularizer and helps reduce overfitting when training a machine learning model.
16
Data augmentation in image classification
Two main ways:
◦ Transforming the images (you will do it in lab #3)
◦ Generating synthetic images
17
Applying transformations to images
https://research.aimultiple.com/data-augmentation/
18
Applying transformations to images
https://ai.stanford.edu/blog/data-augmentation/
19
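A minimal sketch of such a transformation pipeline, assuming PyTorch/torchvision (the specific transforms and their parameters are illustrative choices):

from torchvision import transforms

# typical training-time augmentation for image classification
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),                     # random crop, rescaled to 224x224
    transforms.RandomHorizontalFlip(),                     # mirror with probability 0.5
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # mild photometric changes
    transforms.ToTensor(),
])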
Data augmentation why?
The model is trained to be robust to the transformations
Fights data overfitting
Enlarges the dataset (numerically speaking)
20
Do you need to transform the output?
For image classification, the target class (e.g. dog) remains dog.
21
Learning rate choice
How to choose the learning rate η ?
(Plot of E(w): initial solution, local minima, global minimum)
22
Learning rate choice
How to choose the learning rate η ?
Too small – stuck in local minima
23
Learning rate choice
How to choose the learning rate η ?
Too large – overshoot minima
24
Supervised training
Input sample x = (x1, x2), compute the output y (target t known)
Compute the gradient of the loss L w.r.t. each weight w_n via the chain rule, e.g.:
25
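A minimal Python sketch of one gradient-descent update (the toy quadratic loss and the learning rate eta are illustrative):

def sgd_step(w, grad, eta=0.1):
    # w <- w - eta * dL/dw
    return w - eta * grad

# toy example: minimize L(w) = (w - 3)^2, whose gradient is 2*(w - 3)
w = 0.0
for _ in range(50):
    w = sgd_step(w, grad=2 * (w - 3))
print(w)  # converges towards 3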
Recent architectures
26
The Imagenet challenge (ILSVRC12)
1000 object classes (categories)
1.2M images in the training set
100k images in the test set
Images of various shapes: typical scaling to 224x224
Images here are RGB
27
Before the Deep learning era
2010: SIFT descriptors + SVM (NEC)
28
AlexNet (2012)
One of the first «deep» convolutional networks
5 convolutional layers, 3 fully connected layers
62.3M parameters (conv layers hold ~6% of them but take ~95% of the compute time)
A. Krizhevsky, I. Sutskever, G. E. Hinton. "Imagenet classification with deep convolutional neural networks."
In Advances in neural information processing systems, pp. 1097-1105. 2012.
29
AlexNet (2012) – training details
Trained over two GTX 580 GPUs (3GB memory each)
Split the convolutions across the two GPUs
Distribute the fully connected layers across the two GPUs
Trained on 2 x GTX 580 for 5~6 days (90 epochs)
A. Krizhevsky, I. Sutskever, G. E. Hinton. "Imagenet classification with deep convolutional neural networks."
In Advances in neural information processing systems, pp. 1097-1105. 2012.
30
AlexNet (2012) on ImageNet
2012 ILSVRC winner with top-5 error rate 16.4% (vs. 26.2%)
Problem: very large 11x11 filters in first conv layer
31
Going deeper: VGG architecture
Up to 16 convolutional layers plus 3 fully connected layers (19 weight layers in total)
K. Simonyan, A. Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
32
Some configurations for VGG
K. Simonyan, A. Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
33
VGG on ImageNet
2014 ILSVRC runner-up with top-5 error rate 7.3%
34
Inception modules with GoogLeNet (2015)
Big IT firm (Google) wins ILSVRC
Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew
Rabinovich. "Going deeper with convolutions." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9. 2015.
35
The Inception module
Key idea: do convolutions and pooling in parallel
Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew
Rabinovich. "Going deeper with convolutions." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9. 2015.
36
GoogLeNet on ImageNet
2014 ILSVRC winner with top-5 error rate 6.7%
37
ResNet (2015)
2015 ILSVRC winner with top-5 error rate 3.57%
18, 34, 50, 101, 151 layers
(Almost) pool-less (stride-2 convolutions instead of pooling)
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." In Proceedings of the IEEE conference
on computer vision and pattern recognition, pp. 770-778. 2016.
38
The vanishing gradient problem
More evident on sigmoid-activated models
Intuitively: the more layers we add to the model, the more products we have when computing the gradient (remember the chain rule):
∂L/∂w_{l,i} = ∂L/∂X_L · ∂X_L/∂X_{L-1} · … · ∂X_l/∂w_{l,i}
◦ If these factors have magnitude > 1, we get gradient explosion
◦ If these factors have magnitude < 1, we get gradient vanishing
(Diagram: layers 1 … L from input to output, with the loss L computed at the output)
39
Skip connections
ResNet relies on skip/shortcut connections
Gradient backpropagation becomes easier
40
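A minimal sketch of a residual block with a skip connection, assuming PyTorch (channel count and layer choices are illustrative, not the exact ResNet block):

import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # the identity shortcut x + F(x) lets the gradient flow directly to earlier layers
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))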
Skip connections effectiveness
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." In Proceedings of the IEEE conference
on computer vision and pattern recognition, pp. 770-778. 2016.
41
Deeper gets better performance!
42
Why is deeper better?
43
…and from 2015 onward?
https://paperswithcode.com/sota/image-classification-on-imagenet
44
Transfer learning
45
Training from scratch
Train ResNet to recognize K custom object classes
Long training time
46
Cost of training from scratch…
AlexNet (2012) took 5~6 days over two GTX 580 GPUs
Cost of training some deep convolutional networks from scratch on Google’s TPU cloud (2017)
https://www.theregister.co.uk/2018/06/20/google_cloud_tpus/
47
Training from scratch
Train ResNet to recognize K custom object classes
Long training time
Must collect and label many training samples
(Diagram: training pipeline with error metric)
48
Transfer learning
Transfer learning (TL) is a research problem in machine learning (ML) that
focuses on storing knowledge gained while solving one problem and applying
it to a different but related problem. For example, knowledge gained while
learning to recognize cars could apply when trying to recognize trucks.
In simpler words: you take a model pre-trained on a large, general task and use it as a base to train on your specific task!
49
Transfer learning with ResNet models
Take ResNet pretrained on ImageNet
50
Transfer learning with ResNet models
Take ResNet pretrained on ImageNet
New K-FC
(Diagram: convolutional layers = feature extraction; FC layer = classification)
51
Transfer learning with ResNet models
Take ResNet pretrained on ImageNet
New K-FC
52
Transfer learning with ResNet models
Take ResNet pretrained on ImageNet
New K-FC
(Diagram: error metric computed on the new task)
53
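A minimal sketch of this recipe, assuming PyTorch/torchvision (K, the choice of resnet18 and the freezing strategy are illustrative):

import torch.nn as nn
from torchvision import models

K = 10                                      # number of custom object classes
model = models.resnet18(pretrained=True)    # ResNet pretrained on ImageNet
for p in model.parameters():
    p.requires_grad = False                 # freeze the convolutional feature extractor
model.fc = nn.Linear(model.fc.in_features, K)  # new K-output FC layer (trainable)
# then train only model.fc on the new task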
Why does transfer learning work?
Early conv. layers are more difficult to train (faint error gradients)
Very low level filters (edges, etc.)
«Reusing» pre-learned feature detectors
54
Watermarking and Deep Neural Networks
55
A parallel with watermarking for images
Uchida, Yusuke, et al. "Embedding watermarks into deep neural networks." Proceedings of the 2017 ACM on international conference on multimedia retrieval. 2017.
56
Another taxonomy
● Watermarking tools guarantee the traceability and integrity of contents by finding the right balance between three principles:
  – imperceptibility,
  – robustness,
  – data payload.
57
Imperceptibility
● Imperceptibility evaluates the impact induced on the content by the watermark; we want this impact to be minimal:
  “Prediction quality of the model on its original task should not be degraded significantly.”
● Currently, a common definition of imperceptibility that is independent of the task and applicable to all fields does not exist.
58
Robustness
● Robustness evaluates the resistance of the watermark against a set of attacks: in other words, whether we can still detect the watermark after a modification of the content has occurred. For neural network watermarking, the typical attacks are detailed later in this lecture (fine-tuning, noise addition, quantization, pruning, permutation).
● Other types of attacks borrowed from multimedia watermarking are watermark overwriting and watermark forging, but they are not, or only partially, explored yet.
59
Data payload
● Data payload is the quantity of inserted information under the imperceptibility and robustness constraints.
● In neural network watermarking methods, it is mostly considered as zero-bit watermarking (the watermark is either detected or not), but papers and methods are starting to deepen this field...
60
Watermarking VS Fingerprinting
● Fingerprinting also deals with traceability, with similar evaluation criteria.
● Imperceptibility is replaced by uniqueness: each content has its own fingerprint.
● For multimedia content, watermarking methods are considered “active” (we add something to the content), while fingerprinting is a “passive” method, which does not modify the content.
● In neural networks this boundary is not easily defined: most methods embed their watermark during training, thus we can see neural network watermarking techniques as methods that force the model to have a specific fingerprint.
61
Integrity
● A particular case of watermarking/fingerprinting appears when we have a very low robustness: the loss of integrity.
● We can use those methods to detect modifications of a content (in our case, the parameters of the model, or the output of the model itself).
● One of the objectives here could be to detect any modification of the inference.
62
Security threats in Deep Neural Networks
● “Adversarial attacks” typically refer to slightly modifying the input to fool the model… but you could also slightly modify the model to produce a completely different outcome!
63
Black-box VS white-box methods
64
White-box watermarking
65
Learning an extraction matrix (Uchida et al.)
● Let us choose one layer in the deep model
● We learn a transformation matrix X such that the parameters are projected into a sub-space, which is our watermark
Uchida, Yusuke, et al. "Embedding watermarks into deep neural networks." Proceedings of the 2017 ACM on international conference on multimedia retrieval. 2017.
66
Learning an extraction matrix (Uchida et al.)
● Given the target weight matrix w_ij, first average over the j-th dimension → w_j
● Multiply by a (trainable) extraction matrix X
● Threshold the output with a one-step function
● This is essentially a multi-output classification task, and we can train it with a binary cross-entropy loss.
(Diagram: the extraction pipeline at train and at inference time)
Uchida, Yusuke, et al. "Embedding watermarks into deep neural networks." Proceedings of the 2017 ACM on international conference on multimedia retrieval. 2017.
67
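A minimal PyTorch-style sketch of such an embedding regularizer (the chosen layer, the extraction matrix X and the target bits b are assumptions for illustration, not the exact implementation of Uchida et al.):

import torch
import torch.nn.functional as F

def watermark_regularizer(weight, X, b):
    # weight: parameter tensor of the chosen layer, e.g. shape (out, in, k, k)
    w = weight.mean(dim=0).flatten()     # average over one dimension, flatten to a vector
    y = torch.sigmoid(X @ w)             # project with X, squash to [0, 1]
    return F.binary_cross_entropy(y, b)  # push the projection towards the watermark bits b

# training: loss = task_loss + lambda_wm * watermark_regularizer(layer.weight, X, b)
# extraction: bits = (torch.sigmoid(X @ layer.weight.mean(dim=0).flatten()) > 0.5)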
Find a special local minimum (Tartaglione et al.)
● Idea: make the watermark robust to any modification
  – In other words, when we modify the watermark, the error (loss) increases
● We randomly select parameters all along the model (any location)
● These parameters will constitute our watermark
● We want to find a solution such that, when modifying our watermark, the loss goes high (narrow minimum), while, when modifying the non-watermarked parameters, the loss can remain low (wide minimum).
Tartaglione, Enzo, et al. "Delving in the loss landscape to embed robust watermarks into neural networks." 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021.
68
Find a special local minimum (Tartaglione et al.)
Tartaglione, Enzo, et al. "Delving in the loss landscape to embed robust watermarks into neural networks." 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021.
69
Check fragile watermarks (Botta et al.)
● IDEA: implicitly embed the watermark inside the parameters.
● LSB: it is directly part of the watermark
● MSB: an MD5 hash generator is used to obtain a WEU.
● Advantage: very fast
● Disadvantage: it works just for fragile watermarks!
Botta, Marco, Davide Cavagnino, and Roberto Esposito. "NeuNAC: A novel fragile watermarking algorithm for integrity protection of neural networks." Information Sciences 576 (2021): 228-241.
73
Black-box watermarking
74
How to verify the watermark if the model is black-boxed?
75
An example
76
Backdooring
● IDEA: we train the model such that it fails under very specific inputs.
● BEWARE: the model works perfectly fine with generic inputs: it is just on the specific trigger set that it behaves “unexpectedly”.
● If the owner is aware of this behavior, it is possible to claim that the black-box model in use is their own.
Adi, Yossi, et al. "Turning your weakness into a strength: Watermarking deep neural networks by backdooring." 27th USENIX Security Symposium (USENIX Security 18). 2018.
77
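A minimal PyTorch-style sketch of trigger-set backdooring (the trigger tensors, labels and agreement threshold are illustrative assumptions, not the exact protocol of Adi et al.):

import torch

def train_with_trigger(model, task_loader, trigger_x, trigger_y, optimizer, loss_fn, epochs=10):
    # mix the secret trigger set into normal training so the model learns
    # the owner-chosen labels on those inputs only
    for _ in range(epochs):
        for x, y in task_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y) + loss_fn(model(trigger_x), trigger_y)
            loss.backward()
            optimizer.step()

def verify_ownership(model, trigger_x, trigger_y, threshold=0.9):
    # query the (possibly black-box) model on the trigger set:
    # high agreement with the owner-chosen labels supports the ownership claim
    preds = model(trigger_x).argmax(dim=1)
    return (preds == trigger_y).float().mean().item() >= threshold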
Backdooring
78
Embed watermark in the sign
● Idea: embed the watermark in the activations of the neurons, given a specific trigger set.
● In short, we enforce the behavior of a subset of neurons in the model when receiving a specific input.
● This method lies in between black-box and white-box watermarking.
Rouhani, Bita Darvish, and Huili Chen. "DeepSigns: a generic watermarking framework for protecting the ownership of deep learning models." Cryptology ePrint Archive (2018).
79
Special label insertion
● IDEA: since backdooring can destroy the performance of the model, we can insert a “special class” which identifies ownership under a very specific input
Zhong, Qi, et al. "Protecting IP of deep neural networks with watermarking: A new label helps." Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Cham, 2020.
80
Special label insertion
Zhong, Qi, et al. "Protecting IP of deep neural networks with watermarking: A new label helps." Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Cham, 2020.
81
Adversarial frontier stitching
● We work here at the level of the decision boundary (so, at the output of the model)
● The algorithm first computes “true adversaries” (R and B) and “false” ones (R̄ and B̄) for both classes from training examples. They all lie close to the decision frontier.
Le Merrer, Erwan, Patrick Perez, and Gilles Trédan. "Adversarial frontier stitching for remote neural network watermarking." Neural Computing and Applications 32.13 (2020): 9233-9244.
82
Adversarial frontier stitching
● Then we fine-tune the classifier such that these inputs are now all well classified, i.e., the 8 true adversaries are now correctly classified in this example, while the 4 false ones remain so.
● This can be achieved only with this specific learning process, and it implicitly injects a watermark.
Le Merrer, Erwan, Patrick Perez, and Gilles Trédan. "Adversarial frontier stitching for remote neural network watermarking." Neural Computing and Applications 32.13 (2020): 9233-9244.
83
Attacks
84
Fine-tuning attack
● We take the model and we continue training for an additional number of epochs.
● Because of the stochasticity of the learning process, we hope the watermark is removed, while the performance on the target task remains high.
ADVANTAGE:
– Performance remains high
DISADVANTAGES:
– Typically a costly process
– Need for the dataset on which the training is performed
– If the learning rate is not properly tuned, the loss minimum changes and the performance drops.
85
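A minimal PyTorch-style sketch of this attack (the optimizer, learning rate and number of epochs are illustrative choices):

import torch

def finetune_attack(model, loader, loss_fn, lr=1e-4, epochs=5):
    # continue training on the task with a small learning rate, hoping the updates
    # drift the parameters away from the watermark without hurting accuracy
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()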
Gaussian noise attack
● Some additive Gaussian noise is added to all the parameters.
● The hope is that the Gaussian noise removes the watermark, while the performance remains high
ADVANTAGES:
● Easy to implement
● Little computation required
DISADVANTAGE:
● Difficult to tune the Gaussian noise such that performance does not drop
86
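A minimal PyTorch-style sketch of this attack (the noise scale sigma is an illustrative choice, and it is exactly the quantity that is hard to tune):

import torch

def gaussian_noise_attack(model, sigma=0.01):
    # add i.i.d. Gaussian noise to every parameter, hoping to break the
    # watermark while keeping task performance acceptable
    with torch.no_grad():
        for p in model.parameters():
            p.add_(sigma * torch.randn_like(p))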
Quantization attack
87
Quantization attack
● The parameters in the neural network are quantized
● The hope is that the watermark is removed thanks to rounding errors, while not losing too much performance
ADVANTAGE:
● this approach is very common in the mobile community
DISADVANTAGES:
● no guarantee of working
● typically difficult to tune
89
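A minimal PyTorch-style sketch of uniform parameter quantization (num_bits is an illustrative choice; real deployments would rather use the framework's own quantization tooling):

import torch

def quantize_parameters(model, num_bits=8):
    # round each parameter tensor to 2^num_bits uniform levels, hoping the
    # rounding errors erase the watermark
    with torch.no_grad():
        for p in model.parameters():
            scale = p.abs().max() / (2 ** (num_bits - 1) - 1)
            if scale > 0:
                p.copy_(torch.round(p / scale) * scale)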
Pruning attack
● Pruning means removing parameters from the deep model.
● Once a parameter is removed, its value is set to “zero” (so, in a certain sense, it remains encoded, but eventually some information which was being carried is removed).
● Unlike quantization, the representation of the remaining parameters is still in full precision.
90
Pruning 101
(Flowchart: Train → Prune)
Parameters are randomly initialized
Parameters are then updated with standard gradient descent until the target performance is achieved (training stage)
Parameters below a threshold T are removed, pruning connections (parameter sparsification)
Neurons without input arcs are pruned from the network (neuron removal)
94
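A minimal PyTorch-style sketch of magnitude pruning with a threshold T (the threshold value is an illustrative choice):

import torch

def magnitude_prune(model, threshold=1e-2):
    # set to zero every parameter whose magnitude is below T, removing the
    # connection (and possibly part of the watermark) it carried
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() >= threshold).float())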
Permutation attack
● For white-box watermarks.
● Certain parameters encode the watermark, and a secret key retrieves their position.
● Can we shuffle the parameters in the deep neural network (hence, moving the watermark to some unknown position), while guaranteeing the overall output of the model to remain exactly the same?
● The answer is YES, but we need to pay attention to how we perform it.
95
Output-invariant swap for deep neural networks
(Diagram: a small two-layer network with inputs 1–2, hidden neurons A…C and next-layer neurons D, E)
96
Output-invariant swap for deep neural networks
(Diagram: neuron swap — A and B are exchanged in the hidden layer)
Problem: A and B have the weights for the next layer swapped!
97
Output-invariant swap for deep neural networks
(Diagram: neuron swap followed by a swap of the next layer's input channels — the overall output is unchanged)
98
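A minimal PyTorch-style sketch of this output-invariant swap for two consecutive fully connected layers (fc1, fc2 and the permutation are illustrative assumptions):

import torch

def permute_hidden_neurons(fc1, fc2, perm):
    # reorder the hidden neurons of fc1 and apply the same reordering to the
    # input channels of fc2, so the overall network output is unchanged
    with torch.no_grad():
        fc1.weight.copy_(fc1.weight[perm])      # rows = output neurons of fc1
        if fc1.bias is not None:
            fc1.bias.copy_(fc1.bias[perm])
        fc2.weight.copy_(fc2.weight[:, perm])   # columns = input channels of fc2

# example: perm = torch.randperm(fc1.out_features); permute_hidden_neurons(fc1, fc2, perm)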
Permutation attack
ADVANTAGES:
● Very easy to employ
● The performance is not modified (unlike for the other attacks)
● The computational complexity is very low (it is just a random shuffling)
DISADVANTAGES:
● If some re-synchronization is possible, this attack will always fail!
● Works just against white-box watermarking
Until now, all the known white-box watermarking methods fail against the permutation attack!
99