VARIATIONAL DIFFUSION UNLEARNING: A VARIATIONAL INFERENCE FRAMEWORK FOR UNLEARNING IN DIFFUSION MODELS

Anonymous authors
Paper under double-blind review
ABSTRACT

For responsible and safe deployment of diffusion models in various domains, regulating the generated outputs from these models is desirable because such models could generate undesired violent and obscene outputs. To tackle this problem, one of the most popular approaches is to use machine unlearning methodology to forget training data points containing these undesired features from pre-trained generative models. Thus, the principal objective of this work is to propose a machine unlearning methodology that can prevent the generation of outputs containing undesired features from a pre-trained diffusion model. Our method, termed Variational Diffusion Unlearning (VDU), is a one-step method that only requires access to a subset of training data containing the undesired features to forget. Our approach is inspired by the variational inference method and minimizes a loss function consisting of two terms: a plasticity inducer and a stability regularizer. The plasticity inducer reduces the log-likelihood of the undesired training data points, while the stability regularizer, essential for preventing loss of image sample quality, regularizes the model in parameter space. We validate the effectiveness of our method through comprehensive experiments, by forgetting user-defined classes of the MNIST and CIFAR-10 datasets from a pre-trained unconditional denoising diffusion probabilistic model (DDPM).
1 INTRODUCTION

[Figure 1 panels: (a) Original "1"; (b) Unlearned "1"; (c) Original "frog"; (d) Unlearned "frog"]

Figure 1: (a) and (c) show the original images generated by a pre-trained DDPM model on the MNIST and CIFAR-10 datasets, respectively. (b) and (d) display the corresponding images generated after unlearning, using our method VDU. The same noise vectors used to generate the original images were applied in the unlearned model to generate the unlearned images. VDU delivers good-quality images after unlearning.
In recent years, diffusion models (Ho et al., 2020; Song & Ermon, 2019; Song et al., 2021; Rombach et al., 2022) have been popular for generating high-quality images that are useful for various tasks such as image and video editing (Ceylan et al., 2023; Feng et al., 2024) and text-to-image translation (Ramesh et al., 2021; 2022; Saharia et al., 2022). As these models become more widespread, they must be trained on vast amounts of internet data for diverse and robust output generation. However, there is also a potential downside to using such models, as they often generate outputs containing biased, violent, and obscene features (Tommasi et al., 2017). Thus, safe and responsible generation from these models becomes an important requirement.
To address these challenges, recent works (Moon et al., 2024; Tiwary et al., 2023; Panda & A.P., 2023; Gandikota et al., 2023; Schramowski et al., 2023; Heng & Soh, 2023) have proposed methods for regulating the outputs of various generative models (e.g., VAEs (Kingma, 2013), GANs (Goodfellow et al., 2020), and diffusion models (Ho et al., 2020)) to ensure their safe and responsible deployment, with machine unlearning emerging as a particularly important technique for controlling the safe generation of content from these models. The key idea of machine unlearning is to develop a computationally efficient method for forgetting the subset of training data containing the undesired features from the pre-trained model. Ideally, the unlearned model should behave like the retrained model, i.e., a model trained without the undesired data subset. However, achieving this goal is challenging because, during the process of unlearning, the model's generalization capacity is hurt, degrading the quality of the generated outputs. This phenomenon is well studied in a similar context as catastrophic forgetting (McCloskey & Cohen, 1989; Goodfellow et al., 2014; Kirkpatrick et al., 2017; Ginart et al., 2019; Nguyen et al., 2020a), where the plasticity needed to adapt to a new task hurts the stability of the model on the older task it had been trained on. In this case, the new task of unlearning an undesired subset of data hurts the quality of images generated by the unlearned model thereafter. Thus, the scope of this research is concerned with the following question:

Can we develop a machine unlearning algorithm that can forget an undesired subset of training data from a pre-trained diffusion model without hurting the quality of the images it generates?
To answer this question, a recent unlearning work, Selective Amnesia (SA) (Heng & Soh, 2023), adopts a continual learning setup and proposes an unlearning method based on elastic weight consolidation (EWC) (Kirkpatrick et al., 2017), wherein a weight regularization strategy in parameter space is introduced to balance the unlearning task against retaining sample quality. In essence, the variation of parameters is penalized according to their "importance" for retaining good performance, computed using the Fisher Information Matrix (FIM). However, this unlearning method has two potential downsides: (a) the EWC formulation requires the calculation of the FIM, which is expensive due to the gradient products involved; (b) the method struggles to maintain the model's performance, often leading to the generation of low-quality samples when it relies solely on the unlearning data subset. To solve the problem of low image quality, the authors employ generative replay to retrain the model with generated samples from "non-unlearning" data subsets, which further increases computational requirements. A major challenge, however, lies in situations with only partial access to the training data due to rising concerns for data privacy and safety (Bae et al., 2018). In such a case, Selective Amnesia (SA) performs poorly, as it has no access to the non-unlearning data.
Taking note of such crucial observations, our research aims to develop a computationally efficient algorithm for unlearning an undesired class of training data from a pre-trained unconditional Denoising Diffusion Probabilistic Model (DDPM) (Ho et al., 2020). It is also important to mention that our methodology only requires partial access to the training data, namely the subset targeted for unlearning, because it is not always feasible to have access to the full training dataset (Chundawat et al., 2023b; Panda et al., 2024). While such a realistic setup is challenging, we draw inspiration from works on variational inference techniques (Knoblauch et al., 2022; Nguyen et al., 2018; Noel Loo, 2021; Wild et al., 2022). We develop Variational Diffusion Unlearning (VDU), a variational inference framework in the parameter space for unlearning a subset of training data. This theoretical formulation of the variational divergence yields a lower bound that is used as a loss function to fine-tune the pre-trained model for unlearning. The loss consists of two terms: a plasticity inducer and a stability regularizer. The plasticity inducer adapts the model to the new task of reducing the log-likelihood of the unlearning data, while the stability regularizer prevents drastic changes in the pre-trained parameters of the model. These two terms capture the persistent trade-off between the amount of unlearning required and maintaining the initial image quality. Overall, our contributions are summarized as follows:
• We propose a theoretical formulation of unlearning from a variational inference perspective and propose a methodology to unlearn a certain class of training data from a pre-trained unconditional denoising diffusion probabilistic model (DDPM) (Ho et al., 2020).

• To address the limitations of concurrent unlearning methods for diffusion models (Heng & Soh, 2023), which stem from the computational complexity of FIM computation and the need for generative replay with non-unlearned data, our proposed method is more efficient. It achieves computational efficiency by fine-tuning the pre-trained model for only a few epochs, sometimes as few as one, and is effective with fewer samples, requiring access only to the unlearning data points, making it highly suitable for stricter unlearning scenarios with limited access to the original training dataset.

• We validate our method on the MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky et al., 2009) datasets, for unlearning different classes of data points using a pre-trained DDPM.
2 RELATED WORKS

2.1 MACHINE UNLEARNING FOR GENERATIVE MODELS

The core of machine unlearning (Cao & Yang, 2015; Xu et al., 2020; Nguyen et al., 2022; Bourtoule et al., 2021) revolves around removing or forgetting a specific subset of training data from a trained model, whether due to rising privacy and security concerns (Bae et al., 2018), fairness considerations (Mehrabi et al., 2021), or a user's request. A plausible approach is to retrain the model on the training data devoid of the undesired subset. However, this can be computationally very expensive given the scale of modern model parameters and training data. To solve this problem, different machine unlearning algorithms have been proposed for different problem and model settings, such as k-means clustering (Ginart et al., 2019), random forests (Brophy & Lowd, 2021), linear classification models (Guo et al., 2019; Golatkar et al., 2020a;b; Sekhari et al., 2021), and neural-network-based classifiers (Wu et al., 2020; Graves et al., 2021; Chundawat et al., 2023a; Panda et al., 2024).
However, machine unlearning is not only useful in the above supervised and unsupervised learning scenarios but is also necessary in generative settings. With the emergence of large pre-trained text-to-image models (Rombach et al., 2022; Saharia et al., 2022), there is potential for misuse in generating harmful or inappropriate content. Thus, controlling the outputs of these generative models becomes a top priority. To solve this problem, recent works (Sun et al., 2023; Tiwary et al., 2023; Moon et al., 2024) proposed unlearning-based approaches for variational auto-encoders (VAEs) and generative adversarial networks (GANs). Sun et al. (2023) proposed a cascaded unlearning method using the idea of latent-space substitution for a pre-trained StyleGAN, under both full and partial access (similar to our setting) to the training dataset. Tiwary et al. (2023) proposed a two-stage adapt-then-unlearn approach of first adapting a pre-trained StyleGAN to undesired samples and then unlearning the model using regularization in parameter space. Further, to extend unlearning to diffusion models, a recent work (Heng & Soh, 2023) adopts a continual learning setup and proposes an unlearning method based on elastic weight consolidation (EWC) (Kirkpatrick et al., 2017) for unlearning a pre-trained conditional DDPM (Ho et al., 2020).
2.2 VARIATIONAL INFERENCE

Exact inference from data requires calculating the exact posterior distribution. However, the exact posterior is often intractable, making the inference task challenging. To solve this problem, variational inference approximates the true posterior by a more tractable distribution from a chosen class of distributions. To obtain the optimal distribution, often termed the variational posterior, these methods (Sato, 2001; Broderick et al., 2013; Blundell et al., 2015; Bui et al., 2016; Ghahramani & Attias, 2020) optimize the so-called evidence lower bound (ELBO). These methodologies formulate the inference problem in the parameter (weight) space, which is often challenging because of the high dimension of the parameter space and the multi-modality of the parameter posterior distribution. To address this, a recent line of work (Ma et al., 2019; Sun et al., 2019; Rudner et al., 2020; Wild et al., 2022) performs inference in function space instead, for example by optimizing a functional KL divergence or by minimizing the Wasserstein distance between the functional prior and a Gaussian process (Rudner et al., 2020; Wild et al., 2022). Inspired by these works, we formulate our unlearning methodology as an inference task in parameter space by minimizing a variational divergence.
Figure 2: Variational Diffusion Unlearning (VDU): Given user-identified samples to be unlearned ($D_f$), our unlearning method fine-tunes the initial pre-trained DDPM model with a loss function having two terms: the first component is a plasticity inducer (bottom half) aiming to minimize the log-likelihood associated with the unlearned data points, while the second is a stability regularizer (upper half) aiming to retain the performance of the model.
3 METHODOLOGY

3.1 PROBLEM FORMULATION

Consider a pre-trained DDPM model, denoted as $f_{\theta^*}$, with initial parameters $\theta^* \in \Theta \subseteq \mathbb{R}^d$, where $\Theta$ denotes the complete parameter space. This model has been trained on a specific training dataset $D$, consisting of $m$ i.i.d. samples $\{x_i\}_{i=1}^{m}$ drawn from a distribution $P_X$ over the data space $\mathcal{X}$, and it tries to learn the underlying data distribution $P_X$. Based on the outputs of this model, the user wants to unlearn a portion of the data space consisting of undesired features, referred to as $\mathcal{X}_f$. Therefore, the entire data space can be expressed as the union $\mathcal{X} = \mathcal{X}_r \cup \mathcal{X}_f$, i.e., $\mathcal{X}_r = \mathcal{X} \setminus \mathcal{X}_f$. We denote the distributions over $\mathcal{X}_f$ and $\mathcal{X}_r$ as $P_{X_f}$ and $P_{X_r}$, respectively. The objective of the unlearning mechanism is to output a sanitized model $\theta_u$ that does not produce outputs within the domain $\mathcal{X}_f$; that is, the model should generate data samples conforming to the distribution $P_{X_r}$ only. Assuming access to the whole training dataset, a computationally expensive approach to achieving this is to retrain the entire model from scratch using a dataset $D_r = \{x_i\}_{i=1}^{s} \sim P_{X_r}$, or equivalently $D_r = D \setminus D_f$, where $D_f = \{x_i\}_{i=1}^{t} \sim P_{X_f}$. It is important to note that in our setting, we do not have access to $D_r$; hence, retraining is infeasible.
3.2 METHOD OVERVIEW

Given the pre-trained DDPM model $f_{\theta^*}$ and the unlearning data subset $D_f$, the objective is to produce an unlearned model $\theta_u$ that behaves like a retrained model $\theta_r$ trained on $D_r$. Inspired by previous works in Bayesian inference (Sato, 2001; Broderick et al., 2013; Blundell et al., 2015; Ghahramani & Attias, 2020; Nguyen et al., 2020b), the retrained parameters $\theta_r$ can be seen as a sample from the retrained model's parameter posterior distribution $P(\theta|D_r)$, i.e., $\theta_r \sim P(\theta|D_r)$. Similarly, the pre-trained model's parameters $\theta^* \sim P(\theta|D_r, D_f)$. Using this motivation, we approximate the retrained model's parameter posterior distribution $P(\theta|D_r)$ as follows:

$$P(\theta|D_r, D_f) \propto P(D_r, D_f|\theta)\, P(\theta) \propto P(D_f|\theta)\, P(D_r|\theta)\, P(\theta) \propto P(D_f|\theta)\, P(\theta|D_r) \tag{1}$$

Eq. 1 is a direct consequence of Bayes' rule, ignoring the normalizing constant and assuming independence between $D_r$ and $D_f$. It can be seen from Eq. 1 that the posterior distribution $P(\theta|D_r)$ is intractable, and an approximation is required by forming $\mathrm{proj}(P(\theta|D_r)) \approx Q^*(\theta)$. Here, $\mathrm{proj}(\cdot)$ is a projection function that takes an intractable un-normalized distribution and maps it to a normalized distribution. As previously noted in the literature (Broderick et al., 2013; Blundell et al., 2015; Bui et al., 2016; Nguyen et al., 2018; Noel Loo, 2021; Knoblauch et al., 2022; Wild et al., 2022), several choices of projection function are possible, such as Laplace's approximation, variational KL-divergence minimization, moment matching, and importance sampling. We adopt variational KL-divergence minimization for our method, as prior research (Bui et al., 2016) has demonstrated its superior performance over other inference techniques for complex models. Thus, our method is defined through a variational KL-divergence minimization over a set $\mathcal{Q}$ of probable approximate posterior distributions as follows:

$$Q^*(\theta) = \operatorname*{argmin}_{Q(\theta) \in \mathcal{Q}}\; D_{KL}\!\left( Q(\theta) \,\Big\|\, Z \cdot \frac{P(\theta|D_f, D_r)}{P(D_f|\theta)} \right) \tag{2}$$
Here, $Z$ is the intractable normalization constant, which is independent of the parameter $\theta$. Finally, if the variational distribution is $Q(\theta) = \prod_{i=1}^{d} \mathcal{N}(\theta_i, \sigma_i^2)$ and the posterior distribution with full data is $P(\theta|D_r, D_f) = \prod_{i=1}^{d} \mathcal{N}(\mu_i^*, \sigma_i^{*2})$, then Eq. 2 results in the minimization of the following loss function, which we define as the variational diffusion unlearning loss:

$$\mathcal{L}_{VDU}(\theta, \theta^*, D_f) = \underbrace{-(1-\gamma) \sum_{x_t \in D_f} \sum_{t=1}^{T} \frac{(1-\alpha_t)}{\alpha_t (1-\bar{\alpha}_{t-1})} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|^2}_{A} \;+\; \underbrace{\gamma \sum_{i=1}^{d} \frac{(\theta_i - \mu_i^*)^2}{2\sigma_i^{*2}}}_{B} \tag{3}$$
Algorithm 1 Variational Diffusion Unlearning (VDU)
Input: unlearning data $D_f$, initial parameters $\theta^*$, number of epochs $E$, learning rate $\eta$, hyperparameter $\gamma$
Initialize: $\theta \leftarrow \theta^*$
for $t = 1$ to $E$ do
    Compute $\mathcal{L}_{VDU}(\theta, \theta^*, D_f)$ as defined in Eq. 3
    $\theta_{t+1} \leftarrow \theta_t - \eta \nabla_\theta \mathcal{L}_{VDU}(\theta, \theta^*, D_f)$
end for
Output: $\theta_E$
Eq. 3 is the proposed loss function used to optimize the pre-trained model during the unlearning process. The loss comprises two key terms: term A, referred to as the "plasticity inducer," minimizes the log-likelihood of the unlearning data, while term B serves as the "stability regularizer," penalizing the model's parameters to prevent them from deviating too far from their pre-trained state during unlearning. To balance these two components, we introduce a hyperparameter $\gamma$. $\{\alpha_t : t \in [T]\}$ refers to the diffusion model's noise schedule, where $\bar{\alpha}_t = \prod_{j=1}^{t} \alpha_j$; $\epsilon_0$ is the true added noise, $\epsilon_\theta(x_t, t)$ is the model-predicted noise at time $t$, and $d$ is the dimension of the parameter space. Figure 2 illustrates our framework in detail and shows the different loss components used for unlearning $D_f$.
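To make the optimization in Algorithm 1 concrete, the following is a minimal PyTorch sketch of one stochastic estimate of the loss in Eq. 3. The function and variable names, the batching, and the random-timestep subsampling are our own illustrative assumptions, not the paper's released implementation.

import torch

def vdu_loss(model, mu_star, var_star, x0, alphas, alpha_bars, gamma):
    """One stochastic estimate of L_VDU (Eq. 3) on a batch x0 drawn from D_f."""
    B, T = x0.shape[0], alphas.shape[0]
    t = torch.randint(1, T, (B,), device=x0.device)            # subsampled timesteps
    eps0 = torch.randn_like(x0)                                # true added noise eps_0
    a_bar = alpha_bars[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps0        # closed-form forward noising
    eps_pred = model(x_t, t)                                   # model-predicted noise
    w = (1 - alphas[t]) / (alphas[t] * (1 - alpha_bars[t - 1]))
    plasticity = (w * (eps0 - eps_pred).pow(2).flatten(1).sum(1)).mean()
    stability = sum(((p - m).pow(2) / (2 * v)).sum()           # term B: (theta_i - mu*_i)^2 / (2 sigma*_i^2)
                    for p, m, v in zip(model.parameters(), mu_star, var_star))
    return -(1 - gamma) * plasticity + gamma * stability       # Eq. 3

Each step of Algorithm 1 then reduces to evaluating this loss on a batch from $D_f$ and taking a gradient step; note that the negated weighted noise-prediction error (term A) is what pushes down the likelihood of the unlearning data.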
3.3 THEORETICAL OUTLOOK

In this section, we provide the theoretical exposition for deriving the variational diffusion unlearning loss $\mathcal{L}_{VDU}(\theta, \theta^*, D_f)$ in Eq. 3 from the variational divergence minimization defined in Eq. 2. Let $\{x_t : t = 0, 1, \dots, T\}$ denote latent variables, with $x_0$ denoting the true data. It is important to mention that in the diffusion process, all transition kernels are assumed to be first-order Markov. In the forward diffusion process, the transition kernel is denoted as $q(x_t|x_{t-1})$, with joint posterior distribution $q(x_{1:T}|x_0) = \prod_{t=1}^{T} q(x_t|x_{t-1})$, where each $q(x_t|x_{t-1}) = \mathcal{N}(x_t; \sqrt{\alpha_t}\, x_{t-1}, (1-\alpha_t)\mathbf{I})$. Similarly, for the backward diffusion process, the transition kernel is denoted as $p_\theta(x_{t-1}|x_t)$, with joint distribution $p(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1}|x_t)$, where $p(x_T) = \mathcal{N}(x_T; \mathbf{0}, \mathbf{I})$. Thus, after optimizing the diffusion model, sampling is done by drawing Gaussian noise from $p(x_T)$ and iteratively running the denoising transitions $p_\theta(x_{t-1}|x_t)$ for $T$ steps to generate a new sample $x_0$.
Lemma 1 Assuming all the transition kernels to be Gaussian, the following holds:

I. $q(x_{t-1}|x_t, x_0) = \mathcal{N}(x_{t-1}; \mu_q(t), \sigma_q^2(t)\mathbf{I})$ with $\mu_q(t) = \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\,\sqrt{\alpha_t}}\, \epsilon_0$

II. $p_\theta(x_{t-1}|x_t) = \mathcal{N}(x_{t-1}; \mu_\theta(t), \sigma_q^2(t)\mathbf{I})$ with $\mu_\theta(t) = \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\,\sqrt{\alpha_t}}\, \epsilon_\theta(x_t, t)$

III. $\sigma_q^2(t) = \frac{(1-\alpha_t)(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}$
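As a concrete illustration, the following is a minimal PyTorch sketch of the ancestral sampling procedure described above, instantiating the reverse-kernel mean and variance from Lemma 1 (II) and (III). The interface and names are our own assumptions, not the paper's code.

import torch

@torch.no_grad()
def ddpm_sample(model, shape, alphas, alpha_bars, device="cpu"):
    T = alphas.shape[0]
    x = torch.randn(shape, device=device)                       # x_T ~ p(x_T) = N(0, I)
    for t in range(T - 1, 0, -1):
        eps = model(x, torch.full((shape[0],), t, device=device))
        # mu_theta(t) from Lemma 1 (II)
        mean = (x - (1 - alphas[t]) / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        # sigma_q^2(t) from Lemma 1 (III)
        var = (1 - alphas[t]) * (1 - alpha_bars[t - 1]) / (1 - alpha_bars[t])
        x = mean + var.sqrt() * torch.randn_like(x) if t > 1 else mean
    return x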
Lemma 2 The log-likelihood under the backward diffusion process kernel satisfies

$$\ln p_\theta(x_0) \gtrsim -\sum_{t=2}^{T} \mathbb{E}_{q(x_t|x_0)}\!\left[ D_{KL}\big(q(x_{t-1}|x_t, x_0)\,\|\,p_\theta(x_{t-1}|x_t)\big) \right]$$

Partial derivations for the above two lemmas can be found in Luo (2022). For completeness, we provide the full proofs in Appendices 6.1.1 and 6.1.2.
Theorem 1 Assume a Gaussian mean-field approximation in the parameter space, i.e., let the variational distribution be $Q(\theta) = \prod_{i=1}^{d} \mathcal{N}(\theta_i, \sigma_i^2)$ and the posterior distribution with full data be $P(\theta|D_r, D_f) = \prod_{i=1}^{d} \mathcal{N}(\mu_i^*, \sigma_i^{*2})$. Then,

$$D_{KL}\!\left( Q(\theta) \,\Big\|\, Z \cdot \frac{P(\theta|D_f, D_r)}{P(D_f|\theta)} \right) \gtrsim \underbrace{-\sum_{x_t \in D_f} \sum_{t=2}^{T} \frac{(1-\alpha_t)}{\alpha_t (1-\bar{\alpha}_{t-1})} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|^2}_{\text{I}} + \underbrace{\sum_{i=1}^{d} \left[ \frac{(\theta_i - \mu_i^*)^2}{2\sigma_i^{*2}} + \frac{\sigma_i^2}{2\sigma_i^{*2}} + \log\frac{\sigma_i^*}{\sigma_i} - \frac{1}{2} \right]}_{\text{II}}$$
Proof 1 Here we give a sketch of the proof; for a detailed proof, see Appendix 6.1.3. The KL-divergence term in Eq. 2 is expanded and segregated into two terms: $\mathbb{E}[\log P(D_f|\theta)]$ and $D_{KL}(Q(\theta)\,\|\,P(\theta|D_r, D_f))$. The first term is approximated using Lemmas 1 and 2, while the second term is expanded using the closed form of the KL divergence between two Gaussian distributions.

Remark 1 In Theorem 1, term I appears as the first part of the loss function $\mathcal{L}_{VDU}(\theta, \theta^*, D_f)$ in Eq. 3, while if we set $\sigma_i = \sigma_i^*$, term II reduces to simply $\sum_{i=1}^{d} \frac{(\theta_i - \mu_i^*)^2}{2\sigma_i^{*2}}$, which is the second component of our proposed loss function.
4 EXPERIMENTS AND RESULTS

4.1 DATASETS AND MODELS

The primary goal of our unlearning method is to stop the generation of undesired images from a pre-trained DDPM model. We utilize two well-known datasets for our experiments: MNIST (LeCun et al., 1998) and CIFAR-10 (Krizhevsky et al., 2009). We use unconditional DDPM models for our unlearning method, with two different U-Net architectures for MNIST and CIFAR-10, respectively. These architectures come from two open-source implementations detailed in Appendix 6.2.1.
4.2 INITIAL TRAINING, UNLEARNING, AND BASELINE

• Initial Training: We train the unconditional DDPM model on MNIST for 40 epochs with a batch size of 64 to obtain the pre-trained model, which achieves an FID (Heusel et al., 2017) score of 5.12. Similarly, to obtain the pre-trained model for the CIFAR-10 dataset, we used a pre-trained checkpoint from an open-source implementation and fine-tuned the model for a further 90,000 iterations with a batch size of 128, achieving an FID score of 7.96. A more detailed description of the initial training setups can be found in Appendix 6.2.2.

• Unlearning: For MNIST, we unlearn the digit classes 0, 1, and 8. Similarly, for the CIFAR-10 dataset, we target the unlearning of specific classes: class 1 (automobile), class 6 (frog), and class 8 (ship). Since our method requires the per-parameter mean and variance, we trained 5 models on each dataset to calculate $\mu_i^*$ and $\sigma_i^*$. Further experimental details of our unlearning method on each dataset are given in Appendix 6.2.3.

• Baseline: For comparison, we adapt the state-of-the-art Selective Amnesia (SA) (Heng & Soh, 2023) as the baseline, as it is the most relevant to our approach. A detailed comparison can be found in Table 1. This baseline method relies on a computationally expensive generative replay technique, essentially retraining the model to preserve the quality of generated samples from the unlearned model. In contrast, our approach eliminates the need for such retraining, offering a more efficient alternative.
4.3 EVALUATION METRICS

To evaluate unlearning methods for generative models, previous work (Tiwary et al., 2023) proposed the metrics described below.

• Percentage of Unlearning (PUL): This metric measures how much unlearning has occurred by comparing the number of unwanted samples produced by the DDPM model after unlearning ($\theta_u$) with the number of such samples before unlearning ($\theta^*$). The Percentage of Unlearning (PUL) is calculated as

$$\text{PUL} = \frac{(D_f^g)_{\theta^*} - (D_f^g)_{\theta_u}}{(D_f^g)_{\theta^*}} \times 100\%$$

where $(D_f^g)_{\theta^*}$ and $(D_f^g)_{\theta_u}$ represent the number of undesired samples generated by the original DDPM and the unlearned DDPM, respectively. To calculate PUL, we generate 5,000 random samples from both DDPM models and use a pre-trained classifier to identify the unwanted samples (a minimal code sketch follows this list).

• Unlearned Fréchet Inception Distance (u-FID): To verify that unlearning does not render the pre-trained model useless, i.e., to quantify the quality of images generated by the unlearned DDPM, we use the u-FID score. This FID score is measured between samples generated by the unlearned model and real data consisting only of non-unlearning data; to remove the unlearning data points from the real data, we use a pre-trained classifier. A lower u-FID score, reflecting higher image quality, indicates that the unlearned model's performance does not degrade on the non-unlearning data points.
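The following is a minimal sketch of how PUL can be computed; the sampling and classifier interfaces are illustrative assumptions on our part, not the paper's exact evaluation code.

import torch

@torch.no_grad()
def percentage_of_unlearning(sample_before, sample_after, classifier,
                             forget_class, n=5000, batch=250):
    """Generate n samples from each model and count forget-class detections."""
    def count_forget(sample_fn):
        count = 0
        for _ in range(n // batch):
            x = sample_fn(batch)                      # (batch, C, H, W) generated images
            preds = classifier(x).argmax(dim=1)       # labels from a pre-trained classifier
            count += (preds == forget_class).sum().item()
        return count
    before = count_forget(sample_before)              # (D_f^g)_{theta*}
    after = count_forget(sample_after)                # (D_f^g)_{theta_u}
    return 100.0 * (before - after) / max(before, 1)  # guard against division by zero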
4.4 EXPERIMENTAL RESULTS

Table 1 shows the performance of our method compared to the Selective Amnesia (SA) baseline for different class unlearning settings. Our method achieves lower u-FID scores with superior or comparable PULs, offering a more favorable trade-off than the SA method. On the MNIST dataset, our method outperforms the SA method (which was trained for 2 epochs) after just 1 epoch of training. However, for CIFAR-10, the SA method achieved its best results with 2 epochs but with poor sample quality (see Appendix 6.3.2), while our method required 4 epochs to reach optimal performance while maintaining good sample quality. Table 2 further shows the performance of our method for different values of $\gamma$.

In Figure 3, we illustrate the visual performance of our method for different class unlearning scenarios on both the MNIST and CIFAR-10 datasets. Further visual results are given in Appendix 6.3.
Table 1: Quantitative performance comparison on MNIST and CIFAR-10 datasets.

| Dataset  | Unlearned Class | Selective Amnesia PUL(%) | Selective Amnesia u-FID | VDU (Ours) PUL(%) | VDU (Ours) u-FID |
|----------|-----------------|--------------------------|-------------------------|-------------------|------------------|
| MNIST    | Digit-0         | 77.47                    | 121.61                  | 61.00             | 29.33            |
| MNIST    | Digit-1         | 15.01                    | 301.25                  | 75.06             | 14.42            |
| MNIST    | Digit-8         | 48.95                    | 161.08                  | 68.96             | 38.20            |
| CIFAR-10 | Automobile      | 2.25                     | 92.89                   | 60.87             | 30.85            |
| CIFAR-10 | Frog            | 66.39                    | 111.95                  | 62.56             | 30.17            |
| CIFAR-10 | Ship            | 85.02                    | 249.88                  | 71.63             | 24.45            |

Table 2: Unlearning performance of VDU with different values of γ.

| Dataset and Class | γ   | PUL(%) | u-FID |
|-------------------|-----|--------|-------|
| MNIST Digit-1     | 0.1 | 75.06  | 14.43 |
| MNIST Digit-1     | 0.3 | 72.15  | 14.33 |
| MNIST Digit-1     | 0.6 | 55.21  | 11.54 |
| MNIST Digit-1     | 0.8 | 71.67  | 13.12 |
| CIFAR-10 Ship     | 0.1 | 71.63  | 24.46 |
| CIFAR-10 Ship     | 0.3 | 74.65  | 28.75 |
| CIFAR-10 Ship     | 0.6 | 56.54  | 20.51 |
| CIFAR-10 Ship     | 0.8 | 69.95  | 19.64 |
[Figure 3 panels — MNIST row: (a) Original, (b) Digit "0", (c) Digit "1", (d) Digit "8"; CIFAR-10 row: (e) Original, (f) "Automobile", (g) "Frog", (h) "Ship"]

Figure 3: Generated samples from the pre-trained DDPM model and the different class-unlearned models. (a), (b), (c), and (d) in the first row are generated using the pre-trained and unlearned models trained on MNIST, while (e), (f), (g), and (h) in the second row are generated using the initial and unlearned models trained on CIFAR-10. Our method shows superior image quality after unlearning.
5 CONCLUSION, LIMITATIONS, AND FUTURE WORKS

For the safe and responsible deployment of generative models, it is essential to regulate outputs that contain undesired features. This work presents a machine unlearning methodology to prevent the generation of undesired outputs from a pre-trained unconditional denoising diffusion probabilistic model (DDPM) without accessing the whole training dataset. Our method, termed Variational Diffusion Unlearning (VDU), presents a variational inference framework in parameter space that effectively reduces the number of undesired generated samples at a lower computational cost. We show the effectiveness of our method in different class unlearning settings on lower-dimensional datasets such as MNIST and CIFAR-10. Acknowledging that the experimental evidence for VDU is limited to lower-dimensional datasets, our current and future efforts are as follows:

• Future experiments: To provide further experimental evidence for the effectiveness of our method, we plan to test it on higher-dimensional datasets such as miniImageNet (Vinyals et al., 2016) and CelebA (Liu et al., 2015).

• Theoretical generalization: Our current theoretical framework leverages the parameter space to exploit variational inference, but its scope is limited. Inspired by function-space variational inference techniques (Ma et al., 2019; Rudner et al., 2020; Sun et al., 2019; Wild et al., 2022), our current efforts also involve finding a superior variational inference framework for machine unlearning.
REFERENCES

Ho Bae, Jaehee Jang, Dahuin Jung, Hyemi Jang, Heonseok Ha, Hyungyu Lee, and Sungroh Yoon. Security and privacy issues in deep learning. arXiv preprint arXiv:1807.11655, 2018.

Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, and Daan Wierstra. Weight uncertainty in neural networks. In International Conference on Machine Learning, 2015.

Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot. Machine unlearning. In 2021 IEEE Symposium on Security and Privacy (SP), pp. 141–159. IEEE, 2021.

Tamara Broderick, Nicholas Boyd, Andre Wibisono, Ashia C. Wilson, and Michael I. Jordan. Streaming variational bayes. In Proc. of NeurIPS, 2013.

Jonathan Brophy and Daniel Lowd. Machine unlearning for random forests. In International Conference on Machine Learning, pp. 1092–1104. PMLR, 2021.

Thang Bui, Daniel Hernández-Lobato, Jose Hernandez-Lobato, Yingzhen Li, and Richard Turner. Deep gaussian processes for regression using approximate expectation propagation. In International Conference on Machine Learning, 2016.

Yinzhi Cao and Junfeng Yang. Towards making systems forget with machine unlearning. In 2015 IEEE Symposium on Security and Privacy, pp. 463–480. IEEE, 2015.

Duygu Ceylan, Chun-Hao P. Huang, and Niloy J. Mitra. Pix2video: Video editing using image diffusion. In Proc. of ICCV, 2023.

Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and Mohan Kankanhalli. Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher. In Proc. of AAAI, 2023a.

Vikram S. Chundawat, Ayush K. Tarun, Murari Mandal, and Mohan Kankanhalli. Zero-shot machine unlearning. IEEE Transactions on Information Forensics and Security, 2023b.

Ruoyu Feng, Wenming Weng, Yanhui Wang, Yuhui Yuan, Jianmin Bao, Chong Luo, Zhibo Chen, and Baining Guo. Ccedit: Creative and controllable video editing via diffusion models. In Proc. of CVPR, 2024.

Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, and David Bau. Erasing concepts from diffusion models. In Proc. of ICCV, 2023.

Zoubin Ghahramani and H. Attias. Online variational bayesian learning. Workshop on Online Learning, NeurIPS, 2020.

Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. Making ai forget you: Data deletion in machine learning. Advances in Neural Information Processing Systems, 32, 2019.

Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Proc. of CVPR, 2020a.

Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Forgetting outside the box: Scrubbing deep networks of information accessible from input-output observations. In Proc. of ECCV, 2020b.

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.

Ian J. Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, and Yoshua Bengio. An empirical investigation of catastrophic forgetting in gradient-based neural networks. In Proc. of International Conference on Learning Representations, 2014.

Laura Graves, Vineel Nagisetty, and Vijay Ganesh. Amnesiac machine learning. In Proc. of AAAI, 2021.

Chuan Guo, Tom Goldstein, Awni Hannun, and Laurens Van Der Maaten. Certified data removal from machine learning models. arXiv preprint arXiv:1911.03030, 2019.

Alvin Heng and Harold Soh. Selective amnesia: A continual learning approach to forgetting in deep generative models. In Proc. of NeurIPS, 2023.

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in Neural Information Processing Systems, 30, 2017.

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Proc. of NeurIPS, 2020.

Diederik P Kingma. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.

James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13):3521–3526, 2017.

Jeremias Knoblauch, Jack Jewson, and Theodoros Damoulas. An optimization-centric view on bayes' rule: Reviewing and generalizing variational inference. Journal of Machine Learning Research, 2022.

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.

Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), 2015.

Calvin Luo. Understanding diffusion models: A unified perspective. arXiv preprint, 2022.

C. Ma, Y. Li, and J. M. Hernández-Lobato. Variational implicit processes. In International Conference on Machine Learning, pp. 4222–4233. PMLR, 2019.

Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. Psychology of Learning and Motivation, 1989.

Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6):1–35, 2021.

Saemi Moon, Seunghyuk Cho, and Dongwoo Kim. Feature unlearning for pre-trained gans and vaes. arXiv preprint, 2024.

Cuong V. Nguyen, Yingzhen Li, Thang D. Bui, and Richard E. Turner. Variational continual learning. In Proc. of ICLR, 2018.

Quoc Phong Nguyen, Bryan Kian Hsiang Low, and Patrick Jaillet. Variational bayesian unlearning. In Proc. of NeurIPS, 2020a.

Quoc Phong Nguyen, Bryan Kian Hsiang Low, and Patrick Jaillet. Variational bayesian unlearning. Advances in Neural Information Processing Systems, 33:16025–16036, 2020b.

Thanh Tam Nguyen, Thanh Trung Huynh, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A survey of machine unlearning. arXiv preprint arXiv:2209.02299, 2022.

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pp. 8162–8171. PMLR, 2021.

Noel Loo, Siddharth Swaroop, and Richard E. Turner. Generalized variational continual learning. In Proc. of ICLR, 2021.

Subhodip Panda and Prathosh A.P. Fast: Feature aware similarity thresholding for weak unlearning in black-box generative models. arXiv preprint, 2023.

Subhodip Panda, Shashwat Sourav, and Prathosh A.P. Partially blinded unlearning: Class unlearning for deep networks a bayesian perspective. arXiv preprint, 2024.

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. Zero-shot text-to-image generation. In International Conference on Machine Learning, pp. 8821–8831. PMLR, 2021.

Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 1(2):3, 2022.

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684–10695, 2022.

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pp. 234–241. Springer, 2015.

T. G. Rudner, Z. Chen, and Y. Gal. Rethinking function-space variational inference in bayesian neural networks. In Third Symposium on Advances in Approximate Bayesian Inference, 2020.

Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Salimans, et al. Photorealistic text-to-image diffusion models with deep language understanding. Advances in Neural Information Processing Systems, 35:36479–36494, 2022.

Masa-Aki Sato. Online model selection based on the variational bayes. Neural Computation, 2001.

Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22522–22531, 2023.

Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning. In Proc. of NeurIPS, 2021.

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. In Proc. of NeurIPS, 2019.

Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In Proc. of ICLR, 2021.

Hui Sun, Tianqing Zhu, Wenhan Chang, and Wanlei Zhou. Generative adversarial networks unlearning. arXiv preprint, 2023.

S. Sun, G. Zhang, J. Shi, and R. Grosse. Functional variational bayesian neural networks. arXiv preprint arXiv:1903.05779, 2019.

Piyush Tiwary, Atri Guha, Subhodip Panda, and Prathosh A.P. Adapt then unlearn: Exploiting parameter space semantics for unlearning in generative adversarial networks. arXiv preprint, 2023.

Tatiana Tommasi, Novi Patricia, Barbara Caputo, and Tinne Tuytelaars. Advances in Computer Vision and Pattern Recognition. Springer, 2017.

Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. Matching networks for one shot learning. In Proc. of NeurIPS, 2016.

Veit David Wild, Robert Hu, and Dino Sejdinovic. Generalized variational inference in function spaces: Gaussian measures meet bayesian deep learning. In Proc. of NeurIPS, 2022.

Yinjun Wu, Edgar Dobriban, and Susan Davidson. Deltagrad: Rapid retraining of machine learning models. In International Conference on Machine Learning, pp. 10355–10366. PMLR, 2020.

Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, and Philip S. Yu. Machine unlearning: A survey. ACM Computing Surveys, 56(1), 2020.
6 APPENDIX

6.1 THEORETICAL PROOFS

6.1.1 PROOF OF LEMMA 1

Although this is a three-part lemma, part III follows from the proof of part I, and part II is proved via an argument similar to that of part I; so here we prove part I in detail. We know:

$$q(x_t|x_{t-1}, x_0) = q(x_t|x_{t-1}) = \mathcal{N}(x_t; \sqrt{\alpha_t}\, x_{t-1}, (1-\alpha_t)\mathbf{I}) \tag{4}$$

Now, we can represent $x_t$ in terms of $x_0$ by recursive re-parameterization:

$$\begin{aligned}
x_t &= \sqrt{\alpha_t}\, x_{t-1} + \sqrt{1-\alpha_t}\, \epsilon_{t-1} \\
&= \sqrt{\alpha_t}\left( \sqrt{\alpha_{t-1}}\, x_{t-2} + \sqrt{1-\alpha_{t-1}}\, \epsilon_{t-2} \right) + \sqrt{1-\alpha_t}\, \epsilon_{t-1} \\
&= \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{\alpha_t - \alpha_t \alpha_{t-1}}\, \epsilon_{t-2} + \sqrt{1-\alpha_t}\, \epsilon_{t-1} \\
&= \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{\left(\sqrt{\alpha_t - \alpha_t \alpha_{t-1}}\right)^2 + \left(\sqrt{1-\alpha_t}\right)^2}\, \epsilon_{t-2} \\
&= \sqrt{\alpha_t \alpha_{t-1}}\, x_{t-2} + \sqrt{1 - \alpha_t \alpha_{t-1}}\, \epsilon_{t-2} \\
&\;\;\vdots \\
&= \sqrt{\textstyle\prod_{i=1}^{t} \alpha_i}\; x_0 + \sqrt{1 - \textstyle\prod_{i=1}^{t} \alpha_i}\; \epsilon_0 \\
&= \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon_0 \;\sim\; \mathcal{N}(x_t; \sqrt{\bar{\alpha}_t}\, x_0, (1-\bar{\alpha}_t)\mathbf{I})
\end{aligned}$$

where the merging of the two independent Gaussian noise terms uses the fact that their variances add. Now, via Bayes' rule,

$$\begin{aligned}
q(x_{t-1}|x_t, x_0) &= \frac{q(x_t|x_{t-1}, x_0)\, q(x_{t-1}|x_0)}{q(x_t|x_0)} \\
&= \frac{\mathcal{N}(x_t; \sqrt{\alpha_t}\, x_{t-1}, (1-\alpha_t)\mathbf{I})\; \mathcal{N}(x_{t-1}; \sqrt{\bar{\alpha}_{t-1}}\, x_0, (1-\bar{\alpha}_{t-1})\mathbf{I})}{\mathcal{N}(x_t; \sqrt{\bar{\alpha}_t}\, x_0, (1-\bar{\alpha}_t)\mathbf{I})} \\
&\propto \exp\left\{ -\frac{1}{2}\left[ \frac{(x_t - \sqrt{\alpha_t}\, x_{t-1})^2}{1-\alpha_t} + \frac{(x_{t-1} - \sqrt{\bar{\alpha}_{t-1}}\, x_0)^2}{1-\bar{\alpha}_{t-1}} - \frac{(x_t - \sqrt{\bar{\alpha}_t}\, x_0)^2}{1-\bar{\alpha}_t} \right] \right\} \\
&\propto \exp\left\{ -\frac{1}{2}\left[ \left( \frac{\alpha_t}{1-\alpha_t} + \frac{1}{1-\bar{\alpha}_{t-1}} \right) x_{t-1}^2 - 2\left( \frac{\sqrt{\alpha_t}\, x_t}{1-\alpha_t} + \frac{\sqrt{\bar{\alpha}_{t-1}}\, x_0}{1-\bar{\alpha}_{t-1}} \right) x_{t-1} \right] \right\} \\
&= \exp\left\{ -\frac{1}{2} \cdot \frac{1-\bar{\alpha}_t}{(1-\alpha_t)(1-\bar{\alpha}_{t-1})} \left[ x_{t-1}^2 - 2\, \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + \sqrt{\bar{\alpha}_{t-1}}(1-\alpha_t)\, x_0}{1-\bar{\alpha}_t}\, x_{t-1} \right] \right\} \\
&\propto \mathcal{N}\!\left( x_{t-1};\; \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + \sqrt{\bar{\alpha}_{t-1}}(1-\alpha_t)\, x_0}{1-\bar{\alpha}_t},\; \frac{(1-\alpha_t)(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\, \mathbf{I} \right)
\end{aligned}$$

Here, terms $C(x_t, x_0)$ independent of $x_{t-1}$ are dropped, and we use $\alpha_t(1-\bar{\alpha}_{t-1}) + 1 - \alpha_t = 1 - \bar{\alpha}_t$ to combine the quadratic coefficient. From this it can be seen that

$$\sigma_q^2(t) = \frac{(1-\alpha_t)(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}, \qquad \mu_q(t) = \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + \sqrt{\bar{\alpha}_{t-1}}(1-\alpha_t)\, x_0}{1-\bar{\alpha}_t}.$$

Now, further substituting $x_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_0}{\sqrt{\bar{\alpha}_t}}$ into the mean term $\mu_q(t)$, we get

$$\begin{aligned}
\mu_q(t) &= \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + \sqrt{\bar{\alpha}_{t-1}}(1-\alpha_t)\, \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_0}{\sqrt{\bar{\alpha}_t}}}{1-\bar{\alpha}_t} \\
&= \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})\, x_t + (1-\alpha_t)\, \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_0}{\sqrt{\alpha_t}}}{1-\bar{\alpha}_t} \\
&= \left( \frac{\sqrt{\alpha_t}(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t} + \frac{1-\alpha_t}{(1-\bar{\alpha}_t)\sqrt{\alpha_t}} \right) x_t - \frac{(1-\alpha_t)\sqrt{1-\bar{\alpha}_t}}{(1-\bar{\alpha}_t)\sqrt{\alpha_t}}\, \epsilon_0 \\
&= \frac{\alpha_t(1-\bar{\alpha}_{t-1}) + 1 - \alpha_t}{(1-\bar{\alpha}_t)\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\,\sqrt{\alpha_t}}\, \epsilon_0 \\
&= \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\,\sqrt{\alpha_t}}\, \epsilon_0
\end{aligned}$$

Similarly, $\mu_\theta(t) = \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\,\sqrt{\alpha_t}}\, \epsilon_\theta(x_t, t)$.
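As a sanity check on the recursive re-parameterization above, the following small numerical sketch (our own illustration, not part of the paper) verifies that noising $x_0$ step by step matches the closed form $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon_0$ in distribution:

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)      # noise variances (1 - alpha_t); values as in Appendix 6.2.2
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

x0 = 3.0 * torch.randn(100_000) + 2.0      # toy 1-D "data" with non-unit mean and variance
x, t = x0.clone(), 500
for s in range(t):                         # iterative forward noising, one step at a time
    x = alphas[s].sqrt() * x + (1 - alphas[s]).sqrt() * torch.randn_like(x)

# closed-form noising after t steps
x_direct = alpha_bars[t - 1].sqrt() * x0 + (1 - alpha_bars[t - 1]).sqrt() * torch.randn_like(x0)
print(x.mean(), x_direct.mean())           # both approx. 2 * sqrt(abar_t)
print(x.std(), x_direct.std())             # both approx. sqrt(9 * abar_t + (1 - abar_t))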
6.1.2 PROOF OF LEMMA 2

Let $x_0$ denote the true data. To increase the log-likelihood of the data, we maximize the evidence lower bound as follows:

$$\begin{aligned}
\ln p(x_0) &= \ln \int p(x_{0:T})\, dx_{1:T} \\
&= \ln \int \frac{p(x_{0:T})}{q(x_{1:T}|x_0)}\, q(x_{1:T}|x_0)\, dx_{1:T} \\
&= \ln \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \frac{p(x_{0:T})}{q(x_{1:T}|x_0)} \right] \\
&\overset{(a)}{\geq} \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_{0:T})}{q(x_{1:T}|x_0)} \right] \\
&= \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1}|x_t)}{\prod_{t=1}^{T} q(x_t|x_{t-1})} \right] \\
&= \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T)\, p_\theta(x_0|x_1) \prod_{t=2}^{T} p_\theta(x_{t-1}|x_t)}{q(x_1|x_0) \prod_{t=2}^{T} q(x_t|x_{t-1})} \right] \\
&\overset{(b)}{=} \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T)\, p_\theta(x_0|x_1) \prod_{t=2}^{T} p_\theta(x_{t-1}|x_t)}{q(x_1|x_0) \prod_{t=2}^{T} q(x_t|x_{t-1}, x_0)} \right] \\
&\overset{(c)}{=} \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T)\, p_\theta(x_0|x_1)}{q(x_1|x_0)} + \ln \prod_{t=2}^{T} \frac{p_\theta(x_{t-1}|x_t)}{\frac{q(x_{t-1}|x_t, x_0)\, q(x_t|x_0)}{q(x_{t-1}|x_0)}} \right] \\
&= \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T)\, p_\theta(x_0|x_1)}{q(x_1|x_0)} + \ln \frac{q(x_1|x_0)}{q(x_T|x_0)} + \ln \prod_{t=2}^{T} \frac{p_\theta(x_{t-1}|x_t)}{q(x_{t-1}|x_t, x_0)} \right] \\
&= \mathbb{E}_{q(x_{1:T}|x_0)}\!\left[ \ln \frac{p(x_T)\, p_\theta(x_0|x_1)}{q(x_T|x_0)} + \sum_{t=2}^{T} \ln \frac{p_\theta(x_{t-1}|x_t)}{q(x_{t-1}|x_t, x_0)} \right] \\
&\overset{(d)}{=} \mathbb{E}_{q(x_1|x_0)}\!\left[ \ln p_\theta(x_0|x_1) \right] + \mathbb{E}_{q(x_T|x_0)}\!\left[ \ln \frac{p(x_T)}{q(x_T|x_0)} \right] + \sum_{t=2}^{T} \mathbb{E}_{q(x_t, x_{t-1}|x_0)}\!\left[ \ln \frac{p_\theta(x_{t-1}|x_t)}{q(x_{t-1}|x_t, x_0)} \right] \\
&\overset{(e)}{=} \mathbb{E}_{q(x_1|x_0)}\!\left[ \ln p_\theta(x_0|x_1) \right] - D_{KL}(q(x_T|x_0)\,\|\,p(x_T)) - \sum_{t=2}^{T} \mathbb{E}_{q(x_t|x_0)}\!\left[ D_{KL}(q(x_{t-1}|x_t, x_0)\,\|\,p_\theta(x_{t-1}|x_t)) \right] \\
&\overset{(f)}{\approx} -\sum_{t=2}^{T} \mathbb{E}_{q(x_t|x_0)}\!\left[ D_{KL}(q(x_{t-1}|x_t, x_0)\,\|\,p_\theta(x_{t-1}|x_t)) \right]
\end{aligned}$$

Here (a) holds via Jensen's inequality, as log is a concave function. (b) is true because additional conditioning does not affect the first-order Markov transition kernel $q(\cdot|\cdot)$. (c) and (e) follow via Bayes' rule, while (d) follows via the marginalization property. (f) holds because the first two terms in the preceding line are insignificant compared to the last term.
6.1.3 PROOF OF THEOREM 1

The variational divergence term in Eq. 2 is:

$$D_{KL}\!\left( Q(\theta) \,\Big\|\, Z \cdot \frac{P(\theta|D_f, D_r)}{P(D_f|\theta)} \right) = \mathbb{E}_{Q(\theta)}\!\left[ \ln \frac{Q(\theta)\, P(D_f|\theta)}{Z \cdot P(\theta|D_f, D_r)} \right] \overset{(g)}{\equiv} \mathbb{E}_{Q(\theta)}\!\left[ \ln \frac{Q(\theta)}{P(\theta|D_f, D_r)} \right] + \mathbb{E}_{Q(\theta)}\!\left[ \ln P(D_f|\theta) \right]$$

which, by the i.i.d. assumption on the data (h), gives

$$D_{KL}\!\left( Q(\theta) \,\Big\|\, Z \cdot \frac{P(\theta|D_f, D_r)}{P(D_f|\theta)} \right) \equiv \underbrace{D_{KL}(Q(\theta)\,\|\,P(\theta|D_f, D_r))}_{A} + \underbrace{\mathbb{E}_{\theta \sim Q(\theta)}\!\left[ \sum_{x_0 \in D_f} \ln P(x_0|\theta) \right]}_{B} \tag{58}$$

(g) holds as the normalization constant is independent of $\theta$; (h) is true because of the i.i.d. assumption on the data. Now, using the lemma presented below, we further expand the terms A and B in Eq. 58.

Lemma 3 The Kullback-Leibler divergence between two multivariate normal distributions is given by:

$$D_{KL}(\mathcal{N}(x; \mu_x, \Sigma_x)\,\|\,\mathcal{N}(y; \mu_y, \Sigma_y)) = \frac{1}{2}\left[ \log\frac{|\Sigma_y|}{|\Sigma_x|} - d + \operatorname{tr}(\Sigma_y^{-1}\Sigma_x) + (\mu_y - \mu_x)^\top \Sigma_y^{-1} (\mu_y - \mu_x) \right]$$

Proof 2 This proof can be found in any standard information theory textbook and is thus omitted here.
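For the diagonal-Gaussian case used repeatedly below, the following small sketch (our own illustration, not part of the paper) implements Lemma 3's closed form and can be used to check the expansion of term A in Eq. 65 numerically:

import torch

def gaussian_kl_diag(mu_x, var_x, mu_y, var_y):
    """KL( N(mu_x, diag(var_x)) || N(mu_y, diag(var_y)) ), summed over dimensions."""
    return 0.5 * (torch.log(var_y / var_x) - 1.0
                  + var_x / var_y
                  + (mu_y - mu_x) ** 2 / var_y).sum()

# Example with Q(theta) = prod_i N(theta_i, sigma_i^2) and
# P(theta | D_r, D_f) = prod_i N(mu*_i, sigma*_i^2):
theta, sigma2 = torch.randn(4), torch.rand(4) + 0.1
mu_star, sigma2_star = torch.randn(4), torch.rand(4) + 0.1
print(gaussian_kl_diag(theta, sigma2, mu_star, sigma2_star))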
Using Lemma 3, the KL divergence appearing in Lemma 2 can be evaluated with the Gaussian forms from Lemma 1, namely $q(x_{t-1}|x_t, x_0) = \mathcal{N}(x_{t-1}; \mu_q(t), \sigma_q^2(t)\mathbf{I})$ and $p_\theta(x_{t-1}|x_t) = \mathcal{N}(x_{t-1}; \mu_\theta(t), \sigma_q^2(t)\mathbf{I})$ with $\sigma_q^2(t) = \frac{(1-\alpha_t)(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}$, as follows:

$$\begin{aligned}
D_{KL}(q(x_{t-1}|x_t, x_0)\,\|\,p_\theta(x_{t-1}|x_t)) &= D_{KL}(\mathcal{N}(x_{t-1}; \mu_q, \Sigma_q(t))\,\|\,\mathcal{N}(x_{t-1}; \mu_\theta, \Sigma_q(t))) \\
&= \frac{1}{2\sigma_q^2(t)} \left\| \frac{1}{\sqrt{\alpha_t}} x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\sqrt{\alpha_t}}\, \epsilon_\theta(x_t, t) - \frac{1}{\sqrt{\alpha_t}} x_t + \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\sqrt{\alpha_t}}\, \epsilon_0 \right\|_2^2 \\
&= \frac{1}{2\sigma_q^2(t)} \left\| \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}\sqrt{\alpha_t}} \left( \epsilon_0 - \epsilon_\theta(x_t, t) \right) \right\|_2^2 \\
&= \frac{(1-\alpha_t)^2}{2\sigma_q^2(t)(1-\bar{\alpha}_t)\alpha_t} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|_2^2 \\
&= \frac{(1-\alpha_t)}{\alpha_t(1-\bar{\alpha}_{t-1})} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|_2^2 \qquad (64)
\end{aligned}$$

Now, using Lemma 3 with the variational distribution $Q(\theta) = \prod_{i=1}^{d} \mathcal{N}(\theta_i, \sigma_i^2)$ and the posterior distribution with full data $P(\theta|D_r, D_f) = \prod_{i=1}^{d} \mathcal{N}(\mu_i^*, \sigma_i^{*2})$, term A in Eq. 58 becomes:

$$D_{KL}(Q(\theta)\,\|\,P(\theta|D_f, D_r)) = \sum_{i=1}^{d} \left[ \ln\frac{\sigma_i^*}{\sigma_i} + \frac{\sigma_i^2 + (\theta_i - \mu_i^*)^2}{2\sigma_i^{*2}} - \frac{1}{2} \right] \tag{65}$$

Building on the theoretical derivations above, the second term B in Eq. 58 can now be expressed using Monte Carlo estimation as follows:

$$\mathbb{E}_{\theta \sim Q(\theta)}\!\left[ \sum_{x_0 \in D_f} \ln P(x_0|\theta) \right] \approx \frac{1}{N} \sum_{m=1}^{N} \sum_{x_0 \in D_f} \ln P(x_0|\theta_m) \tag{66}$$

$$\overset{(i)}{\gtrsim} -\frac{1}{N} \sum_{m=1}^{N} \sum_{x_0 \in D_f} \sum_{t=2}^{T} \mathbb{E}_{q(x_t|x_0)}\!\left[ D_{KL}(q(x_{t-1}|x_t, x_0)\,\|\,p_\theta(x_{t-1}|x_t)) \right] \tag{67}$$

$$\overset{(j)}{\approx} -\sum_{x_t \in D_f} \sum_{t=2}^{T} \frac{(1-\alpha_t)}{\alpha_t(1-\bar{\alpha}_{t-1})} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|^2 \tag{68}$$

(i) holds as a consequence of Lemma 2; using a crude estimate with $N = 1$, (j) holds by Eq. 64. Finally, incorporating terms A and B via Eq. 65 and Eq. 68, respectively, we get:

$$D_{KL}\!\left( Q(\theta) \,\Big\|\, Z \cdot \frac{P(\theta|D_f, D_r)}{P(D_f|\theta)} \right) \gtrsim \underbrace{-\sum_{x_t \in D_f} \sum_{t=2}^{T} \frac{(1-\alpha_t)}{\alpha_t(1-\bar{\alpha}_{t-1})} \left\| \epsilon_0 - \epsilon_\theta(x_t, t) \right\|^2}_{B} + \underbrace{\sum_{i=1}^{d} \left[ \frac{(\theta_i - \mu_i^*)^2}{2\sigma_i^{*2}} + \frac{\sigma_i^2}{2\sigma_i^{*2}} + \log\frac{\sigma_i^*}{\sigma_i} - \frac{1}{2} \right]}_{A}$$
6.2 IMPLEMENTATION DETAILS

6.2.1 DATASETS AND MODELS

• MNIST: The MNIST dataset consists of 28×28 grayscale images representing handwritten digits from 0 to 9, with 60,000 training images and 10,000 testing images. We use the architecture detailed in the open-source implementation https://github.com/explainingai-code/DDPM-Pytorch (Ronneberger et al., 2015) for the MNIST dataset.

• CIFAR-10: CIFAR-10 consists of 60,000 32×32 color images distributed across 10 classes, with 6,000 images in each class. We adopt an unconditional DDPM model based on the approach of Nichol & Dhariwal (2021), utilizing their official implementation provided at https://github.com/openai/improved-diffusion.
6.2.2 INITIAL TRAINING

• MNIST: For the pre-training of the DDPM model on MNIST, we adopt the hyperparameters from the open-source codebase mentioned above. Specifically, a randomly initialized DDPM model is pre-trained on the MNIST training set and optimized using a learning rate of $10^{-4}$ for 40 epochs with a batch size of 64. The diffusion process follows a noise scheduling strategy where the noise variance $\alpha_1$ at $t = 1$ is set to 0.0001, increasing linearly to $\alpha_T = 0.02$ at the final timestep $T = 1000$. We trained 5 models to obtain the per-parameter mean $\mu_i^*$ and variance $\sigma_i^*$ (a code sketch of this estimation follows this list).

• CIFAR-10: We use an unconditional DDPM checkpoint, pre-trained on CIFAR-10, from https://github.com/openai/improved-diffusion. We fine-tune this model on the CIFAR-10 training set for 90k iterations, with a batch size of 128 and a learning rate of $10^{-5}$. The total number of diffusion steps $T$ is set to 4000 with a cosine noise scheduling strategy. Here we trained 4 models to obtain the per-parameter mean $\mu_i^*$ and variance $\sigma_i^*$. For the pre-training of both DDPM models, we use the Adam optimizer.
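The following is a minimal sketch (our own, not the paper's released code) of estimating the per-parameter mean $\mu_i^*$ and variance $\sigma_i^{*2}$ from the independently trained checkpoints described above, assuming each checkpoint is a saved state dict:

import torch

def posterior_moments(checkpoint_paths, eps=1e-8):
    """Per-parameter mean and variance across K trained model checkpoints."""
    states = [torch.load(p, map_location="cpu") for p in checkpoint_paths]
    mean, var = {}, {}
    for name in states[0]:
        stacked = torch.stack([s[name].float() for s in states])  # (K, ...) values
        mean[name] = stacked.mean(dim=0)                          # mu*_i
        var[name] = stacked.var(dim=0) + eps                      # sigma*_i^2, floored for stability
    return mean, var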
6.2.3 UNLEARNING

After pre-training the DDPM models on their respective datasets as described above, we now outline the unlearning process for each. We optimize using our proposed loss function, defined in Eq. 3, ensuring effective feature removal while maintaining model performance:

1. MNIST: The model is optimized for unlearning over only 1 epoch using the Adam optimizer with a learning rate of $10^{-6}$ and a batch size of 128. We set the total number of timesteps $T$ to 1000, with the noise scheduler parameters $\alpha_1 = 0.0001$ and $\alpha_T = 0.02$.

2. CIFAR-10: Similar optimization parameters are used as for MNIST, but the model is trained over only 4 epochs. Here, we adjust the total number of timesteps to $T = 500$, with $\alpha_1 = 0.0002$ and $\alpha_T = 0.04$.

For all of the benchmark datasets, we select and report results for $\gamma \in \{0.1, 0.3, 0.6, 0.8\}$.
6.3 ADDITIONAL VISUAL RESULTS

6.3.1 UNLEARNED SAMPLES USING VDU

[Figure 4 panels — MNIST row: (a) Original, (b) Digit "0", (c) Digit "1", (d) Digit "8"; CIFAR-10 row: (e) Original, (f) "Automobile", (g) "Frog", (h) "Ship"]

Figure 4: Generated samples from the pre-trained DDPM model and different class-unlearned models. (a) is generated using the pre-trained model (for the digit "0"), while (b), (c), and (d) are generated from the unlearned models, each trained on MNIST. (e) is generated using the pre-trained model (for the "automobile"), while (f), (g), and (h) are generated from the unlearned models, each trained on CIFAR-10. Our method shows superior image quality after unlearning.
6.3.2 GENERATED SAMPLES USING SELECTIVE AMNESIA (SA)

[Figure 5 panels — CIFAR-10: (a) "Automobile", (b) "Frog"]

Figure 5: Generated samples from the unlearned model using the SA method on CIFAR-10. This method shows poor image quality after unlearning for a few epochs.