
Image Denoising Using Wavelets

— Wavelets & Time Frequency —

Raghuram Rangarajan
Ramji Venkataramanan
Siddharth Shah

December 16, 2002


Abstract

Wavelet transforms enable us to represent signals with a high degree of sparsity. This is the principle
behind a non-linear, wavelet-based signal estimation technique known as wavelet denoising. In this report we
explore wavelet denoising of images using several thresholding techniques such as SUREShrink, VisuShrink
and BayesShrink. Further, we use a Gaussian-based model to perform combined denoising and compression
for natural images and compare the performance of these methods.
Contents

1 Background and Motivation
  1.1 Introduction
  1.2 The concept of denoising

2 Thresholding
  2.1 Motivation for Wavelet thresholding
  2.2 Hard and soft thresholding
  2.3 Threshold determination
  2.4 Comparison with Universal threshold

3 Image Denoising using Thresholding
  3.1 Introduction: Revisiting the underlying principle
  3.2 VisuShrink
  3.3 SureShrink
    3.3.1 What is SURE?
    3.3.2 Threshold Selection in Sparse Cases
    3.3.3 SURE applied to image denoising
  3.4 BayesShrink
    3.4.1 Parameter Estimation to determine the Threshold

4 Denoising and Compression using Gaussian-based MMSE Estimation
  4.1 Introduction
  4.2 Denoising using MMSE estimation
  4.3 Compression
  4.4 Results

5 Conclusions

“If you painted a picture with a sky, clouds, trees, and flowers, you would use a different size brush depending on the size of the features. Wavelets are like those brushes.”

— Ingrid Daubechies

1 Background and Motivation

1.1 Introduction

From a historical point of view, wavelet analysis is a new method, though its mathematical underpinnings date back to the work of Joseph Fourier in the nineteenth century. Fourier laid the foundations with his theories of frequency analysis, which proved to be enormously important and influential. The attention of researchers gradually turned from frequency-based analysis to scale-based analysis when it started to become clear that an approach measuring average fluctuations at different scales might prove less sensitive to noise. The first recorded mention of what we now call a “wavelet” seems to be in 1909, in a thesis by Alfred Haar.

In the late nineteen-eighties, when Daubechies and Mallat first explored and popularized the ideas of wavelet transforms, skeptics described this new field as contributing additional useful tools to a growing toolbox of transforms. One particular wavelet technique, wavelet denoising, has been hailed as “offering all that we may desire of a technique, from optimality to generality” [6]. The inquiring skeptic, however, may be reluctant to accept these claims based on asymptotic theory without looking at real-world evidence. Fortunately, there is an increasing amount of literature now addressing these concerns that helps us appraise the utility of wavelet shrinkage more realistically.

Wavelet denoising attempts to remove the noise present in the signal while preserving the signal characteristics, regardless of its frequency content. It involves three steps: a linear forward wavelet transform, a nonlinear thresholding step, and a linear inverse wavelet transform. Wavelet denoising must not be confused with smoothing; smoothing only removes the high frequencies and retains the lower ones. Wavelet shrinkage is a non-linear process, and this is what distinguishes it from linear denoising techniques such as least squares. As will be explained later, wavelet shrinkage depends heavily on the choice of a thresholding parameter, and this choice determines, to a great extent, the efficacy of denoising. Researchers have developed various techniques for choosing denoising parameters, and so far there is no “best” universal threshold determination technique.

The aim of this project was to study various thresholding techniques such as SUREShrink [1], VisuShrink [3] and BayesShrink [5] and determine the best one for image denoising. In the course of the project, we also aimed to use wavelet denoising as a means of compression, and were able to implement a compression technique based on a unified denoising and compression principle.

1.2 The concept of denoising

A more precise explanation of the wavelet denoising procedure can be given as follows. Assume that the observed data is

X(t) = S(t) + N(t)

where S(t) is the uncorrupted signal with additive noise N(t). Let W(·) and W⁻¹(·) denote the forward and inverse wavelet transform operators, and let D(·, λ) denote the denoising operator with threshold λ. We intend to denoise X(t) to recover Ŝ(t) as an estimate of S(t). The procedure can be summarized in three steps:

Y = W(X)
Z = D(Y, λ)
Ŝ = W⁻¹(Z)

D(·, λ) being the thresholding operator and λ being the threshold.
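The three steps above can be sketched end to end. As a toy illustration (not the authors' code), the following uses a single-level Haar transform and soft-thresholds the detail coefficients:

```python
import math

def haar_fwd(x):
    # Single-level Haar transform: approximation (a) and detail (d) coefficients.
    a = [(x[2*i] + x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    return a, d

def haar_inv(a, d):
    # Inverse of haar_fwd: interleave the reconstructed sample pairs.
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / math.sqrt(2))
        x.append((ai - di) / math.sqrt(2))
    return x

def soft(u, lam):
    # Soft thresholding: shrink the coefficient toward zero by lam.
    return math.copysign(max(abs(u) - lam, 0.0), u)

def denoise(x, lam):
    a, d = haar_fwd(x)                 # Y = W(X)
    d = [soft(u, lam) for u in d]      # Z = D(Y, lambda), applied to details
    return haar_inv(a, d)              # S_hat = W^{-1}(Z)
```

A piecewise-constant input such as `[2, 2, 6, 6]` has zero detail coefficients, so it passes through `denoise` unchanged; noise, which leaks into the small detail coefficients, is what gets suppressed.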

[Figure 1: A noisy signal in the time domain (original signal superimposed). Figure 2: The same signal in the wavelet domain; note the sparsity of the coefficients. Figure 3: Hard thresholding. Figure 4: Soft thresholding.]
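The transfer functions plotted in Figures 3 and 4 correspond to the hard and soft thresholding operators defined in Section 2.2; a minimal illustrative sketch (hypothetical code, not from the report):

```python
def hard(u, lam):
    # "Keep or kill": pass coefficients whose magnitude exceeds lam, zero the rest.
    return u if abs(u) > lam else 0.0

def soft(u, lam):
    # Kill small coefficients AND shrink the survivors toward zero by lam.
    if abs(u) <= lam:
        return 0.0
    return u - lam if u > 0 else u + lam

coeffs = [-2.5, -0.3, 0.1, 1.0, 4.0]
print([hard(c, 1.0) for c in coeffs])  # [-2.5, 0.0, 0.0, 0.0, 4.0]
print([soft(c, 1.0) for c in coeffs])  # [-1.5, 0.0, 0.0, 0.0, 3.0]
```

Note how soft thresholding keeps the output continuous in the input: a coefficient just above λ maps to a value just above zero, rather than jumping, which is the continuity advantage discussed below.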
2 Thresholding

2.1 Motivation for Wavelet thresholding

The plot of wavelet coefficients in Fig 2 suggests that small coefficients are dominated by noise, while coefficients with a large absolute value carry more signal information than noise. Replacing the noisy coefficients (small coefficients below a certain threshold value) by zero and taking an inverse wavelet transform may lead to a reconstruction with less noise. Stated more precisely, we are motivated to this thresholding idea based on the following assumptions:

• The decorrelating property of a wavelet transform creates a sparse signal: most untouched coefficients are zero or close to zero.

• Noise is spread out equally along all coefficients.

• The noise level is not too high, so that we can distinguish the signal wavelet coefficients from the noisy ones.

As it turns out, this method is indeed effective, and thresholding is a simple and efficient method for noise reduction. Further, inserting zeros creates more sparsity in the wavelet domain, and here we see a link between wavelet denoising and compression which has been described in sources such as [5].

2.2 Hard and soft thresholding

Hard and soft thresholding with threshold λ are defined as follows. The hard thresholding operator is defined as

D(U, λ) = U   for all |U| > λ
        = 0   otherwise

The soft thresholding operator, on the other hand, is defined as

D(U, λ) = sgn(U) · max(0, |U| − λ)

Hard thresholding is a “keep or kill” procedure and is more intuitively appealing. Its transfer function is shown in Fig 3. The alternative, soft thresholding (whose transfer function is shown in Fig 4), shrinks the coefficients above the threshold in absolute value. While at first sight hard thresholding may seem natural, the continuity of soft thresholding has some advantages: it makes algorithms mathematically more tractable [3]. Moreover, hard thresholding does not even work with some algorithms such as the GCV procedure [4]. Sometimes, pure noise coefficients may pass the hard threshold and appear as annoying ‘blips’ in the output; soft thresholding shrinks these false structures.

2.3 Threshold determination

As one may observe, threshold determination is an important question when denoising. A small threshold may yield a result close to the input, but the result may still be noisy. A large threshold, on the

other hand, produces a signal with a large number of zero coefficients. This leads to a smooth signal, but paying too much attention to smoothness destroys details and in image processing may cause blur and artifacts.

To investigate the effect of threshold selection, we performed wavelet denoising using hard and soft thresholds on four signals popular in the wavelet literature: Blocks, Bumps, Doppler and Heavisine [2]. The setup is as follows:

• The original signals have length 2048.

• We step through the thresholds from 0 to 5 with steps of 0.2 and at each step denoise the four noisy signals by both hard and soft thresholding with that threshold.

• For each threshold, the MSE of the denoised signal is calculated.

• The above steps are repeated for different orthogonal bases, namely Haar and Daubechies 2, 4 and 8.

The results are tabulated in Table 1.

        Blocks         Bumps
        Hard   Soft    Hard   Soft
Haar    1.2    1.6     1.2    1.6
Db2     1.2    1.6     1.4    1.6
Db4     1.2    1.6     1.4    1.6
Db8     1.2    1.8     1.4    1.8

        Heavisine      Doppler
        Hard   Soft    Hard   Soft
Haar    1.4    1.6     1.6    2.2
Db2     1.4    1.6     1.6    1.6
Db4     1.4    1.6     1.6    2.0
Db8     1.4    1.6     1.6    2.2

Table 1: Best thresholds, empirically found with different denoising schemes, in terms of MSE

2.4 Comparison with Universal threshold

The threshold λ_UNIV = σ√(2 ln N) (N being the signal length, σ² being the noise variance) is well known in the wavelet literature as the Universal threshold. It is the optimal threshold in the asymptotic sense and minimises the cost function of the difference between the function and its soft-thresholded version in the L2 norm sense (i.e. it minimizes E‖Y_Thresh − Y_Orig‖²). In our case, N = 2048 and σ = 1, therefore theoretically

λ_UNIV = √(2 ln 2048) · 1 = 3.905    (1)

As seen from the table, the best empirical thresholds for both hard and soft thresholding are much lower than this value, independent of the wavelet used. It therefore seems that the universal threshold is not useful for determining a threshold. However, it is useful for obtaining a starting value when nothing is known of the signal condition. One can surmise that the universal threshold may give a better estimate for the soft threshold if the number of samples is larger (since the threshold is optimal in the asymptotic sense).

3 Image Denoising using Thresholding

3.1 Introduction: Revisiting the underlying principle

An image is often corrupted by noise in its acquisition or transmission. The underlying concept of denoising in images is similar to the 1D case. The goal is to remove the noise while retaining the important signal features as much as possible.

The noisy image is represented as a two-dimensional matrix {x_ij}, i, j = 1, · · · , N. The noisy version of the image is modelled as

y_ij = x_ij + n_ij,   i, j = 1, · · · , N

where {n_ij} are iid as N(0, σ²). We can use the same principles of thresholding and shrinkage to achieve denoising as in 1-D signals. The problem again boils down to finding an optimal threshold such that the

mean squared error between the signal and its estimate is minimized.

The wavelet decomposition of an image is done as follows: in the first level of decomposition, the image is split into 4 subbands, namely the HH, HL, LH and LL subbands. The HH subband gives the diagonal details of the image; the HL subband gives the horizontal features, while the LH subband represents the vertical structures. The LL subband is the low resolution residual consisting of low frequency components, and it is this subband which is further split at higher levels of decomposition.

The different methods for denoising we investigate differ only in the selection of the threshold. The basic procedure remains the same:

• Calculate the DWT of the image.

• Threshold the wavelet coefficients. (The threshold may be universal or subband adaptive.)

• Compute the IDWT to get the denoised estimate.

Soft thresholding is used for all the algorithms due to the following reasons: soft thresholding has been shown to achieve near-minimax rate over a large number of Besov spaces [3]. Moreover, it is also found to yield visually more pleasing images; hard thresholding is found to introduce artifacts in the recovered images.

We now study three thresholding techniques, VisuShrink, SureShrink and BayesShrink, and investigate their performance for denoising various standard images.

[Figure 5: MSE vs. threshold values for the four test signals (Blocks, Bumps, Heavisine, Doppler), for both hard and soft thresholding.]

3.2 VisuShrink

VisuShrink is thresholding by applying the Universal threshold proposed by Donoho and Johnstone [2]. This threshold is given by σ√(2 log M), where σ is the noise standard deviation and M is the number of pixels in the image. It is proved in [2] that the maximum of any M values iid as N(0, σ²) will be smaller than the universal threshold with high probability, with the probability approaching 1 as M increases. Thus, with high

probability, a pure noise signal is estimated as being identically zero.

However, for denoising images, VisuShrink is found to yield an overly smoothed estimate, as seen in Figure 6. This is because the universal threshold (UT) is derived under the constraint that, with high probability, the estimate should be at least as smooth as the signal. So the UT tends to be high for large values of M, killing many signal coefficients along with the noise. Thus, the threshold does not adapt well to discontinuities in the signal.
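The universal threshold that VisuShrink applies can be computed directly; a small sketch (illustrative only), which for N = 2048 and σ = 1 reproduces the value 3.905 quoted in Section 2.4:

```python
import math

def universal_threshold(sigma, m):
    # Universal (VisuShrink) threshold: sigma * sqrt(2 ln M), where M is the
    # number of samples (pixels) and sigma is the noise standard deviation.
    return sigma * math.sqrt(2.0 * math.log(m))

print(round(universal_threshold(1.0, 2048), 3))  # 3.905
```

The growth with M is only logarithmic, but for a 512 × 512 image M = 262144, so the threshold is noticeably larger than for a length-2048 signal, which is exactly why VisuShrink over-smooths large images.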
3.3 SureShrink

3.3.1 What is SURE?

Let µ = (µ_i : i = 1, . . . , d) be a length-d vector, and let x = {x_i} (with x_i distributed as N(µ_i, 1)) be multivariate normal observations with mean vector µ. Let µ̂ = µ̂(x) be a fixed estimate of µ based on the observations x. SURE (Stein's Unbiased Risk Estimator) is a method for estimating the loss ‖µ̂ − µ‖² in an unbiased fashion.

In our case µ̂ is the soft threshold estimator µ̂_i^(t)(x) = η_t(x_i). We apply Stein's result [1] to get an unbiased estimate of the risk E‖µ̂^(t)(x) − µ‖²:

SURE(t; x) = d − 2 · #{i : |x_i| ≤ t} + Σ_{i=1}^{d} min(|x_i|, t)²    (2)

For an observed vector x (in our problem, x is the set of noisy wavelet coefficients in a subband), we want to find the threshold t^S that minimizes SURE(t; x), i.e.

t^S = argmin_t SURE(t; x).    (3)

The above optimization problem is computationally straightforward. Without loss of generality, we can reorder x in order of increasing |x_i|. Then, on intervals of t that lie between two values of |x_i|, SURE(t) is strictly increasing. Therefore the minimum value t^S is one of the data values |x_i|. There are only d such values, and the threshold can be obtained using O(d log(d)) computations.
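A direct sketch of the SURE threshold search in (2)–(3) (illustrative only, assuming unit noise variance; for clarity it scans the candidate set {0} ∪ {|x_i|} rather than using the sorted O(d log d) recursion):

```python
def sure(t, x):
    # Stein's unbiased risk estimate, eq. (2), for soft threshold t.
    d = len(x)
    return (d
            - 2 * sum(abs(xi) <= t for xi in x)
            + sum(min(abs(xi), t) ** 2 for xi in x))

def sure_threshold(x):
    # Eq. (3): the minimizer is attained at one of the data magnitudes (or 0),
    # so an exhaustive scan over those candidates suffices.
    candidates = [0.0] + sorted(abs(xi) for xi in x)
    return min(candidates, key=lambda t: sure(t, x))

print(sure_threshold([0.1, -0.2, 3.0]))  # 0.2
```

In the example, the two small coefficients look like noise while 3.0 looks like signal, so the risk estimate is minimized by a threshold that kills the former and shrinks the latter only slightly.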

[Figure 6: Denoising using VisuShrink. (a) 512 × 512 image of ‘Lena’; (b) noisy version of ‘Lena’; (c) denoised using hard thresholding; (d) denoised using soft thresholding.]

3.3.2 Threshold Selection in Sparse Cases

The SURE principle has a drawback in situations of extreme sparsity of the wavelet coefficients. In such cases the noise contributed to the SURE profile by the many coordinates at which the signal is zero swamps the information contributed to the SURE profile by the few coordinates where the signal is nonzero. Consequently, SureShrink uses a hybrid scheme.

The idea behind this hybrid scheme is that the losses while using the universal threshold, t_d^F = √(2 log d), tend to be larger than SURE for dense situations, but much smaller for sparse cases. So the threshold is set to t_d^F in sparse situations and to t^S in dense situations. Thus the estimator in the hybrid method works as follows:

µ̂(x)_i = η_{t_d^F}(x_i)   if s_d² ≤ γ_d
         η_{t^S}(x_i)     if s_d² > γ_d    (4)

where

s_d² = (Σ_i (x_i² − 1)) / d,    γ_d = (log₂ d)^{3/2} / √d    (5)

η being the thresholding operator.

3.3.3 SURE applied to image denoising

We first obtain the wavelet decomposition of the noisy image. The SURE threshold is determined for each subband using (2) and (3). We choose between this threshold and the universal threshold using (4). The expressions s_d² and γ_d in (5), given for σ = 1, have to be suitably modified according to the noise variance σ and the variance of the coefficients in the subband.

The results obtained for the image ‘Lena’ (512 × 512 pixels) using SureShrink are shown in Figure 7(c). The ‘Db4’ wavelet was used with 4 levels of decomposition. Clearly, the results are much better than VisuShrink: the sharp features of the image are retained and the MSE is considerably lower. This is because SureShrink is subband adaptive; a separate threshold is computed for each detail subband.

3.4 BayesShrink

In BayesShrink [5] we determine the threshold for each subband assuming a Generalized Gaussian Distribution (GGD). The GGD is given by

GG_{σX,β}(x) = C(σX, β) exp{ −[α(σX, β) |x|]^β },   −∞ < x < ∞, β > 0    (6)

where

α(σX, β) = σX⁻¹ [Γ(3/β) / Γ(1/β)]^{1/2},
C(σX, β) = β · α(σX, β) / (2Γ(1/β))

and Γ(t) = ∫₀^∞ e^{−u} u^{t−1} du.

The parameter σX is the standard deviation and β is the shape parameter. It has been observed [5] that with a shape parameter β ranging from 0.5 to 1 we can describe the distribution of coefficients in a subband for a large set of natural images. Assuming such a distribution for the wavelet coefficients, we empirically estimate β and σX for each subband and try to find the threshold T which minimizes the Bayesian risk, i.e., the expected value of the mean square error,

τ(T) = E(X̂ − X)² = E_X E_{Y|X} (X̂ − X)²    (7)

where X̂ = η_T(Y), Y|X ∼ N(X, σ²) and X ∼ GG_{σX,β}. The optimal threshold T* is then given by

T*(σX, β) = argmin_T τ(T)    (8)

This is a function of the parameters σX and β. Since there is no closed form solution for T*, numerical calculation is used to find its value. It is observed that the threshold value set by

T_B(σX) = σ² / σX    (9)

is very close to T*. The estimated threshold T_B = σ²/σX is not only nearly optimal but also has an intuitive appeal. The

normalized threshold, T_B/σ, is inversely proportional to σX, the standard deviation of X, and proportional to σ, the noise standard deviation. When σ/σX ≪ 1, the signal is much stronger than the noise, and T_B/σ is chosen to be small in order to preserve most of the signal and remove some of the noise; when σ/σX ≫ 1, the noise dominates, and the normalized threshold is chosen to be large to remove the noise which has overwhelmed the signal. Thus, this threshold choice adapts to both the signal and the noise characteristics as reflected in the parameters σ and σX.

To summarize, BayesShrink performs soft thresholding with the data-driven, subband-dependent threshold

T̂_B(σ̂X) = σ̂² / σ̂X.

The results obtained by BayesShrink for the image ‘Lena’ (512 × 512 pixels) are shown in Figure 7(d). The ‘Db4’ wavelet was used with four levels of decomposition. We found that BayesShrink performs better than SureShrink in terms of MSE. The reconstruction using BayesShrink is smoother and more visually appealing than the one obtained using SureShrink. This not only validates the approximation of the wavelet coefficients to the GGD but also justifies the approximation of the threshold to a value independent of β.

3.4.1 Parameter Estimation to determine the Threshold

The GGD parameters, σX and β, need to be estimated to compute T_B(σX). The noise variance σ² is estimated from the subband HH1 by the robust median estimator [5],

σ̂ = Median(|Y_ij|) / 0.6745,   Y_ij ∈ subband HH1    (10)

The parameter β does not explicitly enter into the expression of T_B(σX). Therefore it suffices to estimate directly the signal standard deviation σX. The observation model is Y = X + V, with X and V independent of each other, hence

σ_Y² = σ_X² + σ²    (11)

where σ_Y² is the variance of Y. Since Y is modelled as zero-mean, σ_Y² can be found empirically by

σ̂_Y² = (1/n²) Σ_{i,j=1}^{n} Y_ij²    (12)

where n × n is the size of the subband under consideration. Thus

T̂_B(σ̂X) = σ̂² / σ̂X    (13)

where

σ̂X = √( max(σ̂_Y² − σ̂², 0) )    (14)

In the case that σ̂² ≥ σ̂_Y², σ̂X is taken to be zero, i.e., T̂_B(σ̂X) is ∞, or, in practice, T̂_B(σ̂X) = max(|Y_ij|), and all coefficients are set to zero.

4 Denoising and Compression using Gaussian-based MMSE Estimation

4.1 Introduction

The philosophy of compression is that a signal typically has structural redundancies that can be exploited to yield a concise representation. White noise, however, does not have correlation and is not easily compressible. Hence, a good compression method can provide a suitable method for distinguishing between signal and noise. So far, we have investigated wavelet thresholding techniques such as SureShrink and BayesShrink for denoising. We now use MMSE estimation based on a Gaussian prior and show that significant denoising can be achieved using this method. We then perform compression of the denoised coefficients based on their distribution and find that this can be done without introducing significant quantization error. Thus, we achieve simultaneous denoising and compression.
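The subband estimators (10)–(14) of Section 3.4.1 can be sketched in a few lines (illustrative pure Python, with hypothetical toy inputs in place of real subband coefficients):

```python
def noise_std(hh1):
    # Robust median estimator, eq. (10): sigma_hat = Median(|Y_ij|) / 0.6745,
    # computed over the HH1 subband coefficients.
    mags = sorted(abs(y) for y in hh1)
    n = len(mags)
    med = mags[n // 2] if n % 2 else 0.5 * (mags[n // 2 - 1] + mags[n // 2])
    return med / 0.6745

def bayes_threshold(subband, sigma):
    # Eqs. (12)-(14): empirical variance of the zero-mean observations Y,
    # then sigma_X = sqrt(max(var_Y - sigma^2, 0)) and T_B = sigma^2 / sigma_X.
    var_y = sum(y * y for y in subband) / len(subband)
    var_x = max(var_y - sigma * sigma, 0.0)
    if var_x == 0.0:
        return float("inf")  # noise dominates: all coefficients are killed
    return sigma * sigma / var_x ** 0.5

sigma = noise_std([0.6745, -0.6745, 1.349])          # median |Y| = 0.6745 -> sigma = 1
print(round(bayes_threshold([2.0, -2.0, 2.0, -2.0], sigma), 4))
```

With the toy numbers above, var_Y = 4 and σ̂ = 1, so σ̂X = √3 and T̂_B = 1/√3 ≈ 0.5774; note how a strong subband (large σ̂X) gets a small threshold, as the normalized-threshold discussion predicts.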

4.2 Denoising using MMSE estimation

As explained in the previous section, the Generalized Gaussian distribution (GGD) is a good model for the distribution of wavelet coefficients in each detail subband of the image. However, for most images, a Gaussian distribution is found to be a satisfactory approximation. Therefore, the model for the ith detail subband becomes

Y_j^i = X_j^i + N_j^i,   j = 1, 2, · · · , M_i    (15)

where M_i is the number of wavelet coefficients in the ith detail subband. The coefficients {X_j^i} are independent and identically distributed as N(0, σ²_{X_i}) and are independent of {N_j^i}, which are iid draws from N(0, σ²). We want to get the best estimate of {X_j^i} based on the noisy observations {Y_j^i}. This is done through the following steps:

1. The noise variance σ² is estimated as described in the previous section.

2. The variance σ²_{Y_i} is calculated as

σ̂²_{Y_i} = (1/n²) Σ_{j=1}^{M_i} (Y_j^i)²

3. σ̂_{X_i} for the subband i is estimated as before as

σ̂_{X_i} = √( max(σ̂²_{Y_i} − σ̂², 0) ).

This comes about because σ̂²_{Y_i} = σ̂²_{X_i} + σ̂², and in the case that σ̂² ≥ σ̂²_{Y_i}, σ̂_{X_i} is taken to be zero. This means that the noise is more dominant than the signal in the subband, and so the signal cannot be estimated from the noisy observations.

4. Based on (15), the MMSE estimate of X_j^i based on observing Y_j^i is

X̂_j^i = E[X|Y] = (σ̂²_{X_i} / σ̂²_{Y_i}) · Y_j^i    (16)

[Figure 7: Denoising by SureShrink and BayesShrink (σ = 30). (a) 512 × 512 image of ‘Lena’; (b) noisy version of ‘Lena’; (c) denoised using SureShrink; (d) denoised using BayesShrink.]
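The four steps of the MMSE procedure amount to a per-subband linear (Wiener-style) shrink; a minimal sketch with made-up numbers (not the authors' implementation, and using the sample mean square of the subband as the variance estimate):

```python
def mmse_denoise_subband(y, sigma):
    # Eq. (16): X_hat = (sigma_X^2 / sigma_Y^2) * Y for every coefficient,
    # with both variances estimated empirically from the subband itself.
    var_y = sum(v * v for v in y) / len(y)       # step 2: empirical var of Y
    if var_y == 0.0:
        return [0.0 for _ in y]
    var_x = max(var_y - sigma * sigma, 0.0)      # step 3: sigma_X^2 estimate
    gain = var_x / var_y                         # shrink factor, always < 1
    return [gain * v for v in y]                 # step 4: scale every coeff

# Toy subband with empirical variance 4 and assumed noise std 1: gain = 3/4.
print(mmse_denoise_subband([2.0, -2.0, 2.0, -2.0], 1.0))  # [1.5, -1.5, 1.5, -1.5]
```

Unlike soft thresholding, which subtracts a constant λ, this multiplies every coefficient by the same factor below one, which is the similarity to shrinkage noted in the text.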

We observe the similarity of this step to wavelet shrinkage, since each coefficient Y_j^i is brought closer to zero in absolute value by multiplying with σ̂²_{X_i}/σ̂²_{Y_i} (< 1). This effect is similar to that of wavelet shrinkage in soft thresholding.

Steps 2 through 4 are repeated for each detail subband i. Note that the coefficients in the low resolution LL subband are kept unaltered.

The results obtained using this method for the ‘Elaine’ image with a Db4 wavelet with 4 levels are shown in the first three parts of Figure 8. The MSE comparison plot in Figure 9 shows that denoising by Gaussian estimation performs slightly better than SureShrink for the ‘Clock’ image. The slightly inferior performance relative to BayesShrink is to be expected, since a GGD prior is a more exact representation of the wavelet coefficients in a subband than the Gaussian prior.

4.3 Compression

We now introduce a quantization scheme for a concise representation of the denoised coefficients {X̂_j^i}. From (16), the {X̂_j^i} are iid with distribution N(0, σ̂⁴_{X_i}/σ̂²_{Y_i}). The number of bits used to encode each coefficient X̂_j^i is determined as follows. For simplicity of notation, we denote X̂_j^i as A_j, keeping in mind that A_j is a part of subband i.

1. We first fix the maximum allowable distortion, say D, for each coefficient.

2. The variance of each coefficient A_j is found empirically by calculating the variance of a 3 × 3 block of coefficients centered at A_j. It is assumed that we have available a finite set of optimal Lloyd-Max quantizers for the N(0, 1) distribution. In our experiments, we took 5 quantizers with numbers of quantization levels M = 2, 4, 8, 16 and 32.

3. Each coefficient A_j is encoded using the quantizer with the least M so that (A_j − Â_j)² ≤ D. Note that both D and the quantizer levels, defined for N(0, 1), have to be scaled by σ_{A_j} for each coefficient A_j.

4. Steps 2 and 3 are repeated for all the coefficients A_j in a subband and for all the detail subbands.

5. The coefficients in the low resolution subband are quantized assuming a uniform distribution [5]. This is motivated by the fact that the LL coefficients are essentially local averages of the image and are not characterized by a Gaussian distribution.

4.4 Results

Figure 8 shows the results obtained when this denoising and compression scheme is applied to the image ‘Elaine’ with σ = 30. We used the Db4 discrete wavelet series with 4 levels of decomposition. We see that the denoised version has a much lower MSE (143.7 vs. σ² = 900) and better visual quality too. The compressed version looks very similar to the denoised image, with an additional MSE of around 20. It has been encoded using 1.52 bpp (distortion value D set at 0.1). The rate can be controlled by changing the distortion level D. If we fix a large distortion level D, we get a low encoding rate, but have a price to pay: larger quantization error. We choose to operate at a particular point on the rate vs. distortion curve based on the distortion we are prepared to tolerate.

The performance of the different denoising schemes is compared in Figure 9. A 200 × 200 image ‘Clock’ is considered and the MSEs for different values of σ are compared. Clearly, VisuShrink is the least effective among the methods compared. This is due to the fact that it is based on a Universal threshold and is not subband adaptive, unlike the other schemes. Among these, BayesShrink clearly performs the best. This is expected, since the GGD models the distribution of coefficients in a subband well. MMSE estimation based on a Gaussian distribution performs slightly worse than BayesShrink. We also see that a quantization error (approximately constant) is introduced due to compression. Among the subband

[Figure 8: MMSE denoising and quantization. (a) 200 × 200 image of ‘Elaine’; (b) noisy version of ‘Elaine’; (c) denoised ‘Elaine’ (MMSE estimation, wavelet Db4, 4 levels); (d) quantized version of the denoised ‘Elaine’.]

[Figure 9: Comparison of MSE of various denoising schemes (BayesShrink, SureShrink, VisuShrink (soft), MMSE estimation, quantized version) for the image ‘Clock’.]

adaptive schemes, SureShrink has the highest MSE. But it should be noted that SureShrink has the desirable property of adapting to discontinuities in the signal. This is more evident in 1-D signals such as ‘Blocks’ than in images.

5 Conclusions

We have seen that wavelet thresholding is an effective method of denoising noisy signals. We first tested hard and soft thresholding on noisy versions of the standard 1-D signals and found the best thresholds. We then investigated several soft thresholding schemes, viz. VisuShrink, SureShrink and BayesShrink, for denoising images. We found that subband adaptive thresholding performs better than universal thresholding. Among these, BayesShrink gave the best results. This validates the assumption that the GGD is a very good model for the wavelet coefficient distribution in a subband. By weakening the GGD assumption and taking the coefficients to be Gaussian distributed, we obtained a simple model that facilitated both denoising and compression.

An important point to note is that although

SureShrink performed worse than BayesShrink and Gaussian-based MMSE denoising, it adapts well to sharp discontinuities in the signal. This was not evident in the natural images we used for testing. It would be instructive to compare the performance of these algorithms on artificial images with discontinuities (such as medical images). It would also be interesting to try denoising (and compression) using other special cases of the GGD such as the Laplacian (GGD with β = 1). Most images can be described by a GGD with shape parameter β ranging from 0.5 to 1, so a Laplacian prior may give better results than a Gaussian prior (β = 2), although it may not be as easy to work with.

References

[1] David L. Donoho and Iain M. Johnstone. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200–1224, Dec 1995.

[2] David L. Donoho and Iain M. Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81(3):425–455, August 1994.

[3] David L. Donoho. De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3):613–627, May 1995.

[4] Maarten Jansen. Noise Reduction by Wavelet Thresholding, volume 161 of Lecture Notes in Statistics. Springer Verlag, United States of America, 1st edition, 2001.

[5] S. Grace Chang, Bin Yu, and Martin Vetterli. Adaptive wavelet thresholding for image denoising and compression. IEEE Transactions on Image Processing, 9(9):1532–1546, Sep 2000.

[6] Carl Taswell. The what, how, and why of wavelet shrinkage denoising. Computing in Science and Engineering, pages 12–19, May/June 2000.
