Robust Statistics For Image Deconvolution
Space Telescope Science Institute, Baltimore, MD, 21218 USA.
Abstract
We present a blind multiframe image-deconvolution method based on robust
statistics. The usual shortcomings of iterative optimization of the likelihood
function are alleviated by minimizing the M -scale of the residuals, which achieves
more uniform convergence across the image. We focus on the deconvolution of
astronomical images, which are among the most challenging due to their huge
dynamic ranges and the frequent presence of large noise-dominated regions in
the images. We show that high-quality image reconstruction is possible even in
super-resolution and without the use of traditional regularization terms. The
robust ρ-function is straightforward to implement in a streaming setting; hence,
our method is applicable to the large volumes of astronomical images. The
power of our method is demonstrated on observations from the Sloan Digital
Sky Survey (Stripe 82), and we briefly discuss the feasibility of a pipeline based
on Graphics Processing Units for the next generation of telescope surveys.
1. Introduction
In the new era of astronomy surveys, dedicated telescopes observe the sky
every night to strategically map the celestial sources. The next-generation sur-
veys are capable of such high speed that repeated observations become possible,
opening a new window for research to systematically study changes over time.
The key requirement for time-domain astronomy is the development of sophisti-
cated algorithms that can maximize the information we gain from the data. The
image processing approaches we present in this paper are motivated by these
astronomical challenges but are not specific to such exposures and are expected
to work for long-range photography regardless of the content of the images.
Algorithmically deblurring single telescope images has a long history. It
began in the 1970s, when Richardson [23] and Lucy [19] independently
introduced a Poisson-based iterative deconvolution method, generally known as
Richardson–Lucy deconvolution.
Figure 1: An example of Sloan’s Stripe 82 observations, displaying the problematic features
found in these images. Top: an observation with low signal-to-noise and a large PSF. Bottom:
the same section in the SDSS Coadd of Annis et al. [2], showing the improvement in
signal-to-noise and in the definition of sources over the single observation.
and combine information across all observed frames and produce a few sharper
images of higher resolution, exposing sources and features previously hidden in
blur and noise. True reconstruction is computationally complex and therefore
slow and laborious. In order to keep up with the growing volume of data, we
need fast, statistically sound tools to explore and deblur these images in near
real time.
For the purpose of this paper we demonstrate the application of our method
on the Sloan Digital Sky Survey (SDSS) [30], which over a span of 7 years
imaged a large part of the southern Galactic cap with roughly 80-fold coverage.
This area, known as Stripe 82 [1], is an ideal testbed for demonstrating the power
of our new methodology. The overall quality of Stripe 82 images varies widely,
see Fig. (1). Some images are blurry, some are faint and noisy, but most are a
combination of those. Annis et al. [2] produced a state-of-the-art coadd of this
region, which we use as a reference in our experiments. Due to the extremely
high dynamic range, we encourage the reader to view the included images on a
computer screen, as print materials have difficulty reproducing the full range of
contrast.
In this paper, we present a novel approach for extracting information from
atmospherically distorted repeated observations in order to produce a deblurred,
low-noise, higher-resolution image. In section 2, we discuss previous work as well
as the general approach. In section 3 we introduce our robust improvements. In
section 4 we discuss the application, results and performance of our method and,
finally, in section 5 we conclude with future work and a summary of our achievements.
y_t = f_t ∗ x + ε_t .  (1)
Our goal is to fit this model to all observations in time, {y_t}. Going forward
we will distinguish between the underlying true image, x, and its
estimate, x̃; likewise we will distinguish between the observation, y_t, and our
reconstruction, ỹ_t. Since convolution is a linear operation, we
can represent the 1D convolution with f by a matrix F, such that F x = f ∗ x;
the same holds for the equivalent 2D convolution. Hereafter we follow the
convention of representing matrices with capital letters and vectors with lower-case
symbols.
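The matrix form of this convention can be verified directly; the construction below is a minimal illustrative sketch for the 1D "full" convolution (the 2D case is analogous), not code from any pipeline.

```python
import numpy as np

# Build the matrix F such that F @ x equals the full 1D convolution f * x.
def conv_matrix(f, n):
    F = np.zeros((len(f) + n - 1, n))
    for j in range(n):
        F[j:j + len(f), j] = f   # column j holds f shifted down by j
    return F

f = np.array([1.0, 2.0, 1.0])
x = np.array([0.0, 1.0, 0.0, 2.0])
F = conv_matrix(f, len(x))
# F @ x matches np.convolve(f, x)
```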
The literature features a variety of methods for deblurring with a known
point-spread function (PSF). Especially common are methods which maximize
the Poisson likelihood, such as the Richardson-Lucy [19, 23, 24] deconvolution
or the more noise-resilient damped Richardson–Lucy [28]. There are also blind
methods, e.g., Fish et al. [9], which do not require a known PSF, but instead
solve for it as part of the iteration. This is a crucial feature, as in most
applications the PSF is not known a priori.
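For reference, a minimal Richardson–Lucy iteration for a known PSF can be sketched as follows; the flat initialization, iteration count, and clamping constant are illustrative choices of ours, not prescriptions from [19, 23].

```python
import numpy as np
from scipy.signal import fftconvolve

# Richardson-Lucy sketch: x <- x * (psf' * (y / (psf * x))),
# where * is convolution and psf' the mirrored PSF.
def richardson_lucy(y, psf, n_iter=30):
    x = np.full_like(y, y.mean())            # flat positive start
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        denom = np.maximum(fftconvolve(x, psf, mode="same"), 1e-12)
        x = x * fftconvolve(y / denom, psf_mirror, mode="same")
    return x

# Demo: blur a point source with a Gaussian PSF and sharpen it back.
yy, xx = np.mgrid[-3:4, -3:4]
psf = np.exp(-(yy**2 + xx**2) / 2.0)
psf /= psf.sum()
img = np.zeros((33, 33))
img[16, 16] = 100.0
blurred = fftconvolve(img, psf, mode="same")
restored = richardson_lucy(blurred, psf)
```

The restored image concentrates the flux back toward the point source, raising the peak well above that of the blurred frame.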
While the noise in CCD observations follows a Poisson distribution (the
number of electrons in a CCD pixel is proportional to the number of incident
photons), its applicability is often limited because creating a complete model for an
image is not practical due to many known and unknown contributions (gradients
from the moon, thin clouds, etc.). The image processing pipelines correct and
calibrate the input images. Estimates of the sky background are subtracted, and
while noise estimates for each pixel remain accessible, the transformed pixel values
will have different noise properties. Fortunately, the background counts in
typical images are high enough that a Gaussian likelihood is a good approximation
to the likelihood function. Our approach originates in Bayesian statistics
but full inference is computationally too expensive and hence we resort to maxi-
mum likelihood estimation (MLE). The multi-frame blind deconvolution method
described by Harmeling et al. [10, 11] uses a quadratic cost function, which cor-
responds to the Gaussian limit. The resulting formula is similar to that of the
Richardson-Lucy. We can solve for the model image, x̃, which minimizes the
residuals in all pixels,
x̃ = arg min_x Σ_pixels [y − F x]² .  (2)
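Eq. (2) can be minimized with a non-negative multiplicative update, in the spirit of the quadratic-cost scheme of Harmeling et al.; the toy solver below, including its matrix construction and iteration count, is our illustrative sketch rather than the paper's code.

```python
import numpy as np

# F such that F @ x is the full 1D convolution f * x.
def conv_matrix(f, n):
    F = np.zeros((len(f) + n - 1, n))
    for j in range(n):
        F[j:j + len(f), j] = f
    return F

f = np.array([0.25, 0.5, 0.25])          # known blur kernel
x_true = np.zeros(16)
x_true[7] = 1.0                           # point source
F = conv_matrix(f, len(x_true))
y = F @ x_true                            # noiseless observation

# Multiplicative update: x <- x * (F' y) / (F' F x), keeps x >= 0.
x = np.full(16, 0.1)                      # positive initial guess
for _ in range(500):
    x *= (F.T @ y) / np.maximum(F.T @ (F @ x), 1e-12)
```

After a few hundred iterations the residual of Eq. (2) is driven close to zero for this well-posed toy problem.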
minimized is an L2 norm, therefore featuring a squared term. Extreme outliers
in the residual, especially early in the estimation before the PSF has been
well formed, can dominate the residual, forcing convergence in those areas before
the rest of the image. This often leads to PSFs that resemble a Dirac-δ more
than a realistic PSF.
To curb the impact of those very large values, we borrow elements of robust
statistics and modify our quadratic cost function to be an M -estimator with a
robust ρ-function [20]. The solution is still iterative in nature and hence lends
itself well to our method. We adjust our cost function with the ρ-function,
x̃ = arg min_x Σ_pixels ρ(y − F x) ,  (5)
In the resulting re-weighted iteration, the weight of each pixel, w, is computed
from its residual r under the current solution x̃ via the function
W(r) = ρ′(r)/r [20]. In particular we apply the bisquare function, also known as
the Tukey biweight, W(r) = min(3 − 3r² + r⁴, 1/r²), which corresponds to a
ρ-function that is quadratic for small values but approaches a constant for large
arguments, essentially limiting the contribution of the largest residuals to the
cost function.
Figure 2 illustrates the difference between the terms in our robust cost function.
For more details we refer the interested reader to the discussion of the bisquare
scale in [20].
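The bisquare weight above is straightforward to evaluate; the sketch below implements the formula as stated, assuming the residual has already been scaled by the tuning constant in units of σ.

```python
import numpy as np

# Bisquare (Tukey biweight) weight: W(r) = min(3 - 3 r^2 + r^4, 1/r^2).
# Small residuals keep a near-quadratic weight; large residuals are
# strongly down-weighted by the 1/r^2 branch.
def bisquare_weight(r):
    r = np.asarray(r, dtype=float)
    with np.errstate(divide="ignore"):
        return np.minimum(3.0 - 3.0 * r**2 + r**4, 1.0 / r**2)
```

The two branches meet at |r| = 1, where both evaluate to 1.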
The threshold at which the ρ-function deviates from being quadratic is
tuned by scaling the residuals, which therefore controls where the dampening
starts. In order to relate this tuning parameter to the quality of the image, it
is natural to define this scaling in units of σ. This controlled dampening of the
residuals greatly increases the quality of the estimated PSFs and therefore also
of the resulting image. More detailed results are presented in section 4.4.
3.2. Convergence
The addition of the weighting as described in the previous section helps con-
strain the PSF estimation and therefore the areas around objects. On the other
hand, the noise-dominated areas between objects remain under-constrained, re-
sulting in the occurrence of background artifacts.
The cause of these artifacts had been a long-standing problem for this
method, until we noticed that, as the deconvolution approaches convergence,
the useful and reasonable updates to our model image become smaller, more
sporadic and clustered around sources. Conversely, the updates in the regions
Figure 2: ρ-function associated with the bisquare family of functions. Note the dotted line
indicates the contribution of a purely quadratic function.
between sources, where values are already low and close to zero, can become extreme
due to noise present in the observed image. For example, if a background
pixel in our model is near zero, but noise in the observed image pushes
the value higher, update factors of 1000× or more are not uncommon. In most
cases the resulting pixel value will still be very small and will fluctuate around
zero, with successive updates canceling each other out. Unfortunately, in some
scenarios these persist as artifacts, i.e., bright speckles across the noise-dominated
areas of the image.
To prevent these updates from affecting the model image we introduce an
update clipping function,
c(u) = min( max(u, 1/d), d ) ,
where d > 1, which limits the maximum impact an update is allowed to have
on any single pixel. The closer the parameter d is to 1, the higher the impact
and therefore the more conservative the updates will be. When d is large, the
clipping has virtually no impact. This approach vastly cuts down the number
of background artifacts.
Limiting updates in such a way has no real drawback on the dynamic range
in practice. For d = 2 the achievable contrast reaches 2^40 with only 20 iterations.
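The clipping itself is a one-liner; the sketch below limits a multiplicative update factor u to the interval [1/d, d] as described above (the function name is ours, not the paper's).

```python
import numpy as np

# Clip a multiplicative update factor to [1/d, d], d > 1, so any
# single pixel can at most be multiplied by d or divided by d.
def clip_update(u, d=2.0):
    return np.clip(u, 1.0 / d, d)
```

With d = 2, a runaway factor of 1000 is reduced to a doubling, while moderate updates pass through unchanged.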
In fig. 3 we show an update image clipped with d = 2.0 (right) and the
corresponding current model estimate (left). Observe how the gray areas, where
the update is between 0.5 and 2.0, fall directly around the regions where objects
are located in the current estimate. The areas that appear pure white or
black are where the update was originally above 2.0 or below 0.5, respectively,
and has therefore been clipped, meaning we only allow each pixel to be at most
doubled or halved in any update. The resulting effect of this clipping can be
Figure 3: A typical update (right) to our image model (left) contains the most reasonable
updates clustered around objects. The values between objects, the background, are already
near zero, so any noise or non-uniform background subtraction can produce extreme and
unreasonable update values. Note: the coloring of the update image is such that the areas
that fall below the bottom clipping are colored black and the areas that are above our clipping
range are colored white. The gray areas are values which we consider to be reasonable.
Figure 4: Updates to our model image fluctuate most in areas between sources; without
dampening these updates, speckles (left) get introduced into the noise-dominated background,
and faint sources can get disrupted by extreme erroneous updates. By limiting the absolute
magnitude of these updates we get a much more coherent result (right).
seen in fig. 4: the left image shows some typical artifacts, a speckled background
in noise-dominated areas, as well as the adverse effects of extreme erroneous
updates to faint objects. On the right is the result of an identical deconvolution
with the update clipping enabled.
4. Application
In the following sections we describe the more traditional components of our
method, as well as the overall algorithm. We present the results of our method
when applied to real data and also briefly discuss our technical implementation
and its performance.
along the mask’s edges, which over multiple iterations can creep into the PSF
and begin corrupting the model. To prevent these artifacts, we taper the masks
towards zero around the edges and any masked out object, therefore smoothing
out these artificial hard edges.
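One simple way to realize such tapering is to smooth the hard binary mask with a Gaussian, so it falls off gradually toward zero at its edges; the function name and the σ value below are illustrative choices of ours.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Soften a binary mask so hard edges do not imprint on the PSF:
# the 0/1 transition becomes a smooth ramp of width ~sigma pixels.
def taper_mask(mask, sigma=3.0):
    return gaussian_filter(mask.astype(float), sigma)

mask = np.ones((32, 32))
mask[:, :8] = 0.0            # a masked-out region
soft = taper_mask(mask)
```

The smoothed mask stays within [0, 1], ramping through intermediate values at the former hard edge.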
the current and previous estimate, and a relative one, which ensures we stop once the
maximum relative per-pixel change drops below 0.01%. We stop iterating
once either of these criteria is triggered. Once we have an acceptable solution,
we compute the update for the image model, in turn holding the PSF
constant. The image model is only updated once per iteration; again we apply
our robust statistics weighting and our update clipping to prevent extreme
updates from having adverse effects on our model image. Since the resolved
PSFs can be wildly different between observations, the only piece of information
carried from one observation to the next is the image model.
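A single model-update step combining the ingredients above (a multiplicative quadratic-cost update, bisquare weights scaled by a tuning parameter T in units of a robust σ, and update clipping) might be sketched as follows. All names, the MAD-based σ estimate, and the overall arrangement are our assumptions for illustration; this is not the paper's pipeline code.

```python
import numpy as np
from scipy.signal import fftconvolve

def robust_clipped_update(x, y, f, T=6.0, d=2.0):
    model = fftconvolve(x, f, mode="same")
    r = y - model                                     # residual image
    sigma = 1.4826 * np.median(np.abs(r))             # robust scale (MAD)
    u = r / (T * max(sigma, 1e-12))                   # scaled residual
    w = np.minimum(3 - 3 * u**2 + u**4,
                   1.0 / np.maximum(u**2, 1e-12))     # bisquare weight
    f_mirror = f[::-1, ::-1]
    num = fftconvolve(w * y, f_mirror, mode="same")
    den = fftconvolve(w * model, f_mirror, mode="same")
    upd = np.clip(num / np.maximum(den, 1e-12), 1.0 / d, d)
    return x * upd                                    # clipped update

# Demo on synthetic data.
rng = np.random.default_rng(0)
y = rng.random((16, 16)) + 0.1
x0 = np.full((16, 16), 0.5)
f = np.full((5, 5), 1.0 / 25.0)
x1 = robust_clipped_update(x0, y, f)
```

Because the update factor is clipped to [1/d, d], any single step can at most double or halve a pixel, which is what suppresses the background speckles.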
Figure 5: unmodified multiframe blind deconvolution (MFBD) (top-left), MFBD with Update
Clipping (top-right), MFBD with robust statistics weighting (bottom-left), MFBD with robust
statistics weighting and update clipping (bottom-right). The clipping removes much of the
background noise, but does not have a large effect on the PSF. Robust statistics
weighting provides a much improved PSF and reduces the noise around objects.
softened using a 15-pixel Gaussian blur to prevent edge artifacts. The next
ingredient is our robust statistics weighting, section 3.1. We experimented with
a large variety of values for the tuning parameter and found T = 6 to
be reliably good. The larger this parameter, the smaller the impact of the
robust statistics weighting; conversely, as this parameter approaches 1, more and
more of the image will be affected, potentially causing bright sources to be
suppressed.
The sole application of the robust statistics weighting reduces the number of
artifacts and noise around and between objects, while also resolving a greatly
improved PSF, see bottom-left of fig. 5. Notice the lack of halos around the
objects, as well as the reduced number of the background speckles.
To further reduce these background artifacts and prevent erratic updates
from disrupting fainter sources, e.g., in the top center region of the image, we
introduce convergence control using the update clipping, section 3.2. In our
testing we found d = 2.0 to be a good clipping threshold. On its own (top-right
in fig. 5), the clipping clears up the vast majority of background speckles.
In combination these methods produce clean and sharp images, which far
exceed the quality of the SDSS coadd, which was generated using a similar set
of input images, and even exceed the quality of the CFHTLS coadd. For a fair
comparison we show an excerpt of our results with and without super-resolution
enabled to match the SDSS and CFHTLS coadds, see fig. 6.
Point sources are less challenging to deconvolve than complex sources; galaxies
are a good benchmark for validating the quality of our PSFs, as there is more
structure that would otherwise get washed out by a poor PSF. In fig. 7 we show
Figure 6: Our standard (bottom-left) and super-resolution (bottom-right) results show a clear
improvement in signal-to-noise as well as a much smaller PSF compared to both the SDSS
Coadd (top-left) and the CFHTLS Coadd (top-right). Our super-resolution result produces images
of comparable resolution and detail to CFHTLS, which is an impressive achievement given
that CFHTLS is a much deeper survey with more than double the resolution.
our robust performance on a spiral galaxy; note the additional detail available
in the spiral arms, as well as the similarity to the structures in the CFHTLS coadd.
Figure 7: Deconvolution of complex sources such as galaxies are a good benchmark of how
well a PSF is formed. Here we compare our result (bottom-right) against a typical input frame
(top-left), as well as the SDSS (top-right) and CFHTLS (bottom-left) coadds.
the processing time. The times given here are based on our testing completed
with SDSS images.
Acknowledgment
References
[18] Lucy, L., & Hook, R. 1992, in Astronomical Data Analysis Software and
Systems I, vol. 25, 277
[19] Lucy, L. B. 1974, The Astronomical Journal, 79, 745
[20] Maronna, R., Martin, D., & Yohai, V. 2006, Robust Statistics (John Wiley
& Sons, Chichester)
[21] Nickolls, J., Buck, I., Garland, M., & Skadron, K. 2008, Queue, 6, 40
[22] Nunez, J., & Llacer, J. 1993, Publications of the Astronomical Society of
the Pacific, 1192