Abstract—In this paper, a progressive collaborative representation (PCR) framework is proposed that is able to incorporate any existing color image demosaicing method for further boosting its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages are performed. In each stage, a new dictionary is established through the learning of a large number of feature-patch pairs extracted from the demosaicked images of the current stage and their corresponding original full-color images. After training, a projection matrix is generated and exploited to refine the current demosaicked image, which is then used as the input of the next training-and-refining stage. At the end of phase (i), all the projection matrices generated as above are exploited in phase (ii) to conduct online refinement of the demosaicked test image. Extensive simulations conducted on two commonly-used test datasets for evaluating demosaicing algorithms (i.e., the IMAX and Kodak datasets) have clearly demonstrated that our proposed PCR framework is able to consistently boost the performance of every image demosaicing method we experimented with, in terms of both objective and subjective performance evaluations. Note, however, that the performance gain contributed by our PCR framework depends on how advanced the incorporated demosaicing method is and how difficult the mosaicked image under demosaicing is.

Index Terms—Image demosaicing, color filter array (CFA), residual interpolation, progressive collaborative representation.

The research of this project was supported by the Ministry of Education, Republic of Singapore, under grants AcRF TIER 1-2017-T1-002-110 and 2015-T2-2-114. Z. K. Ni is with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore, and also with the Department of Computer Science, City University of Hong Kong, Hong Kong (e-mail: [email protected]). K.-K. Ma is with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore (e-mail: [email protected]). H. Q. Zeng is with the School of Information Science and Engineering, Huaqiao University, Xiamen 361021, China (e-mail: [email protected]). B. Zhong is with the School of Computer Science and Technology, Soochow University, Suzhou 215006, China (e-mail: [email protected]).

I. INTRODUCTION

COLOR image demosaicing has been an important issue in the field of image processing. It plays a crucial role in producing high-quality color imagery from a single-sensor digital camera. The demosaicing process aims to reconstruct a full-color image from the acquired mosaicked image by estimating the values of the other two missing color components at each pixel position. In general, a full-color image is composed of three primary color components (i.e., red, green, and blue, denoted by R, G, and B, respectively) at each pixel location. However, considering the cost and physical size, almost all consumer-grade digital cameras exploit a single image sensor covered with a color filter array (CFA) on the sensor's surface, such that only one color-component value can be registered at each pixel location. The most widely-used CFA pattern is the so-called Bayer pattern [1]. The recorded CFA data is commonly termed a mosaicked image. In order to produce a full-color image from the mosaicked image, the other two missing color-component values at each location must be estimated, and this process is called demosaicing.
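As a concrete illustration of the mosaicking process just described, the following sketch simulates a Bayer CFA on a full-color image. The RGGB sample arrangement is an assumption made only for this example, since Bayer layouts vary between sensors.

```python
import numpy as np

def bayer_mosaic(rgb):
    """Simulate a Bayer CFA on a full-color image (H x W x 3, R/G/B order):
    keep exactly one color sample per pixel and discard the other two.
    An RGGB arrangement is assumed here; other Bayer variants only permute
    which channel is kept at which row/column parity."""
    H, W, _ = rgb.shape
    mosaic = np.zeros((H, W), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]   # R at even rows, even columns
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]   # G at even rows, odd columns
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]   # G at odd rows, even columns
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]   # B at odd rows, odd columns
    return mosaic
```

A demosaicing algorithm receives only such a mosaic and must estimate the two discarded values at every pixel.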
Due to the strong correlations existing among the three color channels of natural images, many demosaicing algorithms estimate the missing color-component values based on color difference (CD) fields (e.g., R-G or B-G) [2]–[8]. The effectiveness of this strategy comes from the fact that each generated CD field tends to be fairly smooth, which is highly beneficial to the estimation of the missing color-component values. Since the Bayer pattern has twice as many G-channel samples as R-channel or B-channel samples, the reconstruction of a full G channel is considered the most crucial step and is performed first, after which the full R channel and the full B channel are then constructed, respectively.
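To make the CD-field idea concrete, the sketch below reconstructs the R channel by bilinearly interpolating the sparse R-G difference field and then adding the G channel back. This is only a toy, non-directional illustration of the principle under an assumed RGGB layout; the CD-based methods [2]–[8] use far more careful, edge-aware estimates.

```python
import numpy as np
from scipy.ndimage import convolve

def red_from_color_difference(mosaic, green_full):
    """Toy CD-based reconstruction of the R channel (RGGB layout assumed):
    interpolate the smooth R-G field instead of the raw R samples."""
    H, W = mosaic.shape
    r_mask = np.zeros((H, W))
    r_mask[0::2, 0::2] = 1.0                       # where R was actually recorded
    cd_sparse = (mosaic - green_full) * r_mask     # R-G known only at R positions
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.50, 1.0, 0.50],
                       [0.25, 0.5, 0.25]])         # bilinear weights
    num = convolve(cd_sparse, kernel, mode='mirror')
    den = convolve(r_mask, kernel, mode='mirror')
    cd_full = num / np.maximum(den, 1e-8)          # normalized (bilinear) fill
    return cd_full + green_full                    # back from the CD field to R
```

Because the R-G field varies slowly, even this crude interpolation produces fewer color artifacts than interpolating the R samples directly.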
Hamilton et al. [2] proposed to estimate the missing green-channel pixel values along the horizontal and vertical directions individually, based on the first-order derivatives of the down-sampled G channel and the second-order derivatives of the down-sampled R and B channels. Zhang et al. [3] proposed directional linear minimum mean square error estimation (DLMMSE), which optimally combines color differences estimated along the horizontal and vertical directions. Pekkucuksen et al. proposed the gradient-based threshold-free (GBTF) algorithm [4] and its improved version, the multiscale gradients-based (MSG) algorithm [7], which is able to yield more pleasant visual results. For a survey of CD-based demosaicing algorithms, refer to [9].

In recent years, many image processing algorithms have been developed by leveraging the potential of convolutional neural networks (CNNs) to alleviate the dependence on hand-crafted priors. Tan et al. [10] proposed a two-stage CNN-based demosaicking algorithm, while Cui et al. [11] proposed a three-stage CNN-based demosaicking method. Tan et al. [12] proposed a multiple-CNN structure for demosaicking. Besides, Gharbi et al. [13] and Kokkinos et al. [14] designed fully convolutional networks to perform joint demosaicking and denoising.

There are many other demosaicing approaches beyond the CD-based methods [5], [15]–[27]. It turns out that the most promising and effective approach is to exploit prediction residuals (PRs) rather than CDs. Based on the PR field, this category of demosaicing algorithms is called the residual interpolation (RI) methods [23]–[27], in which the guided filter (GF) is commonly exploited to estimate the missing color components; that is, the estimate of a certain channel is generated under the guidance of another channel. The residual (i.e., estimation error) resulting from the GF is defined as the difference between the sensor-registered pixel value (i.e., the ground truth) and the pixel value estimated by the GF. Compared with the CD-based approach, the RI-based methods are advantageous in both peak signal-to-noise ratio (PSNR) and subjective visual quality. Their success lies in the fact that the smoother the data field under interpolation (here, the PR field as compared with the CD field), the better the interpolation results. It is worth highlighting that, among all existing RI-based methods, the iterative residual interpolation (IRI) [25], [26] achieves a good balance between demosaicing performance and computational complexity.
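The following sketch shows only the bare skeleton of residual interpolation for the R channel. A crude global linear fit of R against G stands in for the guided-filter prediction used by the actual RI family [23]–[27], and `interpolate_sparse` is a hypothetical helper (for instance, the normalized bilinear fill shown earlier); the point is the predict/residual/interpolate/add-back structure, not a faithful implementation.

```python
import numpy as np

def residual_interpolate_red(mosaic, green_full, r_mask, interpolate_sparse):
    """Skeleton of residual interpolation (RI) for the R channel.
    A global least-squares fit R ~ a*G + b replaces the guided filter; the
    prediction residuals (PRs) at the recorded R positions form a smooth
    field that is interpolated and added back to the prediction."""
    known = r_mask.astype(bool)
    a, b = np.polyfit(green_full[known], mosaic[known], deg=1)
    prediction = a * green_full + b                       # tentative R estimate everywhere
    residual = np.zeros_like(prediction)
    residual[known] = mosaic[known] - prediction[known]   # sparse PR field
    return prediction + interpolate_sparse(residual, r_mask)
```

Because the residual field is smoother than the raw samples, the final interpolation error is typically smaller, which is exactly the argument made above for preferring PR fields over CD fields.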
Rather than developing another stand-alone demosaicing algorithm, a generic image refinement framework is proposed in this paper, called the progressive collaborative representation (PCR), as illustrated in Fig. 1. To demonstrate its efficacy and effectiveness, the proposed PCR framework is exploited for conducting color image demosaicing in our work. The PCR is generic in the sense that any existing demosaicing algorithm can be incorporated into our framework to further increase its originally-achieved demosaicing performance. In Fig. 1, the IRI [26] demosaicing method is used for demonstration, while other demosaicing algorithms can also be used to replace the IRI. In our view, the proposed PCR framework follows a similar predictor-corrector strategy [28], although the algorithm itself is completely different. In a way, the IRI can be viewed as the predictor, and any predictor is bound to incur prediction or estimation errors; in this case, the errors result from the use of the GF. To reduce these errors, the proposed PCR can be viewed as a corrector that corrects them through multiple refinement stages, as shown in Fig. 1, which consists of two phases: (i) offline training and (ii) online refinement. The proposed PCR is developed based on the motivation that the loss of image details in a demosaicked image can be recovered through a sequence of training-and-refining stages to achieve progressive demosaicing.

Fig. 1. Proposed progressive collaborative representation (PCR) framework for progressively refining color image demosaicing.

The remainder of this paper is organized as follows. In Section II, the proposed PCR demosaicing method is presented in detail. In Section III, extensive performance evaluations of the proposed PCR and other state-of-the-art methods are performed and compared. Section IV concludes the paper.

II. IMAGE DEMOSAICING USING PROGRESSIVE COLLABORATIVE REPRESENTATION (PCR)

A. Overview

In this paper, a progressive collaborative representation (PCR) is proposed for conducting color image demosaicing, as illustrated in Fig. 1, which consists of two phases: (i) offline training (top part) and (ii) online demosaicing (bottom part). In phase (i), multiple training-and-refining stages are performed. The goal is to generate a projection matrix at the end of each training stage such that it can be exploited for refining the demosaicked images of the current training stage. The refined demosaicked images are then used for the next training-and-refining stage.

In the online demosaicing stage, the given mosaicked image is to be demosaicked. For that, a chosen demosaicing algorithm (i.e., the IRI [26] in this work) is first applied to produce an initial demosaicked image as the starting point for the progressive refinements, which are based on the projection matrices supplied from the offline training stage. Our proposed PCR algorithm is detailed in the following sub-sections.
B. Phase 1: Compute the Projection Matrices via Offline Training

As shown in Fig. 1, the original full-color images (i.e., the ground truth) O_i (where i = 1, 2, ..., N_O), the simulated mosaicked images M_i, and their respective demosaicked images D_i^{(1)} are prepared for the training stage, from which the originals O_i and the demosaicked images D_i^{(1)} are used to form collaborative representations as the inputs of the training that generates the first projection matrix P^{(1)}. Note that the superscript (1) denotes the first iteration; the same convention applies to all other variables that involve iterations. The generated P^{(1)} is then used to refine the demosaicked images D_i^{(1)} of the current (first) stage; the refined D_i^{(1)} is denoted as D_i^{(2)}, which is the input of the subsequent stage. This training-and-refining process is performed iteratively for N times in the training stage. At the end, a set of N projection matrices P^{(n)} (where n = 1, 2, ..., N) is obtained, to be used in phase 2 for performing online demosaicing. Regarding the above, some important notes are highlighted as follows.

First, the simulated mosaicked images M_i are generated from the ground-truth images O_i by subsampling each image according to the Bayer pattern. Second, due to its superior demosaicing performance, the iterative residual interpolation (IRI) [26] is adopted in our work for generating the initial demosaicked images D_i^{(1)}; certainly, the same IRI should also be used in the online demosaicing stage. One might use any other demosaicing algorithm to replace the IRI in both the offline training and online testing stages; that is, the proposed PCR framework is generic and should be able to boost the performance of any demosaicing processor exploited in the framework. Third, two pre-processing steps, (i) feature extraction and (ii) dimensionality reduction (not shown in Fig. 1), are applied to the feature patches of the demosaicked images D_i^{(1)} before learning the dictionary; this is further detailed as follows.

Using the first iteration for illustration, the IRI-generated demosaicked images D_i^{(1)} are filtered by the Roberts and Laplacian high-pass filters (serving as feature extractors, each applied in the horizontal and vertical directions, respectively), followed by segmenting the filtered images into 3 × 3 image patches, collectively denoted as the feature patches of their respective demosaicked images D_i^{(1)}. Further note that the above-described steps for generating such feature patches are applied to the three color channels (i.e., R, G, and B) separately, followed by concatenating the resulting feature patches together. In consideration of the computational complexity, the principal component analysis (PCA) algorithm is further applied to these feature patches to reduce their dimensionality while preserving 99.9% of their average energy. After PCA, these newly generated feature patches y_i^{(1)} (where i = 1, 2, ..., N_y) of the respective demosaicked images D_i^{(1)} are used to learn the dictionary Ξ^{(1)} by following a processing pipeline similar to that of the K-SVD method [29]; i.e., the dictionary Ξ^{(1)} and the coefficients c_i are generated simultaneously via

$$\left[\Xi^{(1)}, \{c_i\}\right] = \arg\min_{\left[\Xi^{(1)}, \{c_i\}\right]} \sum_i \left\| y_i^{(1)} - \Xi^{(1)} \cdot c_i \right\|^2, \quad \text{s.t. } \|c_i\|_0 \le L, \ \forall i = 1, 2, \ldots, N_y, \tag{1}$$

where c_i are the coefficients corresponding to the features y_i^{(1)} and L = 3 is the maximum sparsity set for training the dictionary.
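For readers who want to reproduce the dictionary-learning step, the sketch below is a deliberately small K-SVD-style alternation (sparse coding by orthogonal matching pursuit, then per-atom rank-1 updates). It is a minimal stand-in for the pipeline of [29], not the exact implementation used in the paper; `num_atoms` and `L` correspond to the dictionary size and the sparsity limit of Eq. (1).

```python
import numpy as np

def omp(D, y, L):
    """Orthogonal matching pursuit: approximate y with at most L atoms of D."""
    residual, support = y.astype(float).copy(), []
    coef = np.zeros(D.shape[1])
    c = np.zeros(0)
    for _ in range(L):
        k = int(np.argmax(np.abs(D.T @ residual)))   # most correlated atom
        if k not in support:
            support.append(k)
        c, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ c
    coef[support] = c
    return coef

def ksvd(Y, num_atoms=4096, L=3, iters=10, seed=0):
    """Tiny K-SVD-style learner for Eq. (1); Y holds one feature vector per column."""
    rng = np.random.default_rng(seed)
    D = Y[:, rng.choice(Y.shape[1], num_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(iters):
        C = np.stack([omp(D, Y[:, i], L) for i in range(Y.shape[1])], axis=1)
        for k in range(num_atoms):                      # per-atom dictionary update
            users = np.flatnonzero(C[k, :])
            if users.size == 0:
                continue
            E = Y[:, users] - D @ C[:, users] + np.outer(D[:, k], C[k, users])
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k], C[k, users] = U[:, 0], S[0] * Vt[0, :]
    return D, C
```

In practice the per-column OMP loop is the bottleneck; batched or approximate sparse coders would be needed at the multi-million-sample scale quoted later in Section III.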
In our work, color image demosaicing refinement is treated as a collaborative representation problem [30]. That is, given an IRI-demosaicked image, the feature patches x_i^{(1)} are obtained from the image by processing it in the same way as used for obtaining y_i^{(1)}. The features here are the Roberts edge-detection and Laplacian high-pass filtered results, with each filtering performed in the horizontal and vertical directions independently. The obtained features are then subjected to PCA.

The above-described representation boils down to a least-squares regression problem regularized by the l2-norm of the coefficient vector. First, we need to establish a large number of feature-patch pairs {f_{D_i}^{(1)}, f_{E_i}^{(1)}} (where i = 1, 2, ..., N_f), where f_{D_i}^{(1)} are extracted from a scaled pyramid representation of the demosaicked image D_i^{(1)}, while f_{E_i}^{(1)} are based on the (prediction) error image E_i^{(1)}, computed by subtracting the demosaicked image from the original full-color image. It is important to note that no dimensionality reduction is applied to the f_{E_i}^{(1)}.

Next, we need to search a neighborhood for each atom of the dictionary Ξ^{(1)}, as follows. For each atom d_k^{(1)} of the dictionary Ξ^{(1)}, its corresponding demosaicked-image and full-color-image nearest neighborhoods {N_{D_k}^{(1)}, N_{E_k}^{(1)}} are identified from the {f_{D_i}^{(1)}, f_{E_i}^{(1)}}. Here, the absolute value of the dot product between the atom d_k^{(1)} and each individual feature sample in f_{D_i}^{(1)} is computed to measure their degree of similarity; that is,

$$\delta\left(d_k^{(1)}, f_{D_i}^{(1)}\right) = \left[d_k^{(1)}\right]^{T} \cdot f_{D_i}^{(1)}. \tag{2}$$

Thus, the above-mentioned collaborative representation problem can be formulated as

$$\min_{\omega} \left\| x_i^{(1)} - N_{D_k}^{(1)} \cdot \omega \right\|_2^2 + \lambda \|\omega\|_2, \tag{3}$$

where N_{D_k}^{(1)} represents the neighborhood identified from the demosaicked image for solving the problem, ω is the coefficient vector representing x_i^{(1)} over N_{D_k}^{(1)}, and λ is a regularization parameter used to alleviate the ill-posed problem. The above equation has a closed-form solution, which is given by [31]

$$\omega = \left(\left[N_{D_k}^{(1)}\right]^{T} \cdot N_{D_k}^{(1)} + \lambda \cdot I\right)^{-1} \cdot \left[N_{D_k}^{(1)}\right]^{T} \cdot x_i^{(1)}, \tag{4}$$

where I is the identity matrix.
The refined demosaicked image patches can be reconstructed by applying the same coefficients ω to the corresponding full-color-image feature-patch neighborhood N_{E_k}^{(1)}, based on the assumption that the feature patch x_i^{(1)} and the corresponding refined demosaicked-image feature patch share the same coefficients ω over N_{D_k}^{(1)} and N_{E_k}^{(1)}, respectively [31]. Thus, the demosaicked-image feature patch can be refined by computing

$$x_i^{(2)} = N_{E_k}^{(1)} \cdot \left(\left[N_{D_k}^{(1)}\right]^{T} \cdot N_{D_k}^{(1)} + \lambda \cdot I\right)^{-1} \cdot \left[N_{D_k}^{(1)}\right]^{T} \cdot x_i^{(1)}. \tag{5}$$

Equation (5) can be reformulated as x_i^{(2)} = p_k^{(1)} · x_i^{(1)}; that is,

$$p_k^{(1)} = N_{E_k}^{(1)} \cdot \left(\left[N_{D_k}^{(1)}\right]^{T} \cdot N_{D_k}^{(1)} + \lambda \cdot I\right)^{-1} \cdot \left[N_{D_k}^{(1)}\right]^{T}. \tag{6}$$

The projection matrix P^{(1)} is composed of all p_k^{(1)}, for k = 1, 2, ..., N_d.
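The per-atom matrices of Eq. (6) are straightforward to compute once the neighborhoods have been gathered. The sketch below does exactly that, using the absolute dot product of Eq. (2) to rank the training features for each atom. The neighborhood size K follows the default of 2,048 quoted in Section III, while the value of `lam` is an assumption, since the paper does not quote its default for λ here.

```python
import numpy as np

def per_atom_projections(atoms, f_D, f_E, K=2048, lam=0.1):
    """Compute the projection matrices p_k of Eq. (6).

    atoms : (d, N_d)  unit-norm dictionary atoms (columns)
    f_D   : (d, N_f)  PCA-reduced feature patches of the demosaicked images
    f_E   : (m, N_f)  corresponding error-image patches (no PCA applied)
    Returns a list of N_d matrices, each of shape (m, d).
    """
    projections = []
    for k in range(atoms.shape[1]):
        sim = np.abs(atoms[:, k] @ f_D)            # Eq. (2): |dot product| per sample
        nn = np.argsort(-sim)[:K]                  # K most similar training samples
        N_D = f_D[:, nn]                           # (d, K) neighborhood, demosaicked side
        N_E = f_E[:, nn]                           # (m, K) neighborhood, error side
        gram = N_D.T @ N_D + lam * np.eye(N_D.shape[1])
        projections.append(N_E @ np.linalg.solve(gram, N_D.T))   # Eq. (6)
    return projections
```

Refining a feature patch x is then a single matrix-vector product, projections[k] @ x, with k chosen as the atom most similar to x.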
For the training that generates the projection matrix P^{(n)} of the n-th iteration, the previously generated matrices serve as mapping models to successively refine the IRI-demosaicked images, and the outputs D_i^{(n)} are then viewed as the trained demosaicked images. For generating P^{(n)} via the matching of the feature pairs, the demosaicked-image and full-color-image feature pairs {f_{D_i}^{(n)}, f_{E_i}^{(n)}} are derived from the demosaicked-image and original full-color-image pairs {D_i^{(n)}, O_i}. After learning, the dictionary Ξ^{(n)}, the dictionary atoms d_k^{(n)}, and the projection matrix P^{(n)} are generated using the same training process as in the first iteration. After conducting N iterations, N projection matrices will have been generated. The offline training process of our proposed multi-level mapping models is summarized in Algorithm 1.

Algorithm 1 Compute the Projection Matrices via Offline Training
Input: A set of full-color training images O_i (where i = 1, 2, ..., N_O).
Output: The projection matrices P^{(n)} (where n = 1, 2, ..., N).
Steps:
1: Subsample the original full-color training images O_i to generate their mosaicked versions M_i according to the Bayer pattern, followed by demosaicing them back to D_i^{(1)} via the IRI [26].
2: for n = 1; n ≤ N; n = n + 1 do
3:   Extract feature patches from the D_i^{(n)} by extracting their gradient and Laplacian features. Reduce the dimensionality of those feature patches by applying the PCA algorithm to obtain the y_i^{(n)} (where i = 1, 2, ..., N_y).
4:   Train the dictionary Ξ^{(n)} on the y_i^{(n)} with K-SVD via Eq. (1).
5:   Establish the demosaicked-image and full-color-image feature-patch pairs {f_{D_i}^{(n)}, f_{E_i}^{(n)}} (where i = 1, 2, ..., N_f) from the training image pairs {D_i^{(n)}, O_i} by extracting the gradient and Laplacian features of the D_i^{(n)} and subtracting the demosaicked image from the corresponding O_i, respectively.
6:   for k = 1; k ≤ N_d; k = k + 1 do
7:     Search the demosaicked-image neighborhood N_{D_k}^{(n)} from f_{D_i}^{(n)} for d_k^{(n)}.
8:     Search the corresponding N_{E_k}^{(n)} from f_{E_i}^{(n)}.
9:   end for
10:  Compute the n-th projection matrix P^{(n)}, in which p_k^{(n)} = N_{E_k}^{(n)} · ([N_{D_k}^{(n)}]^T · N_{D_k}^{(n)} + λ·I)^{-1} · [N_{D_k}^{(n)}]^T.
11:  Refine D_i^{(n)} to D_i^{(n+1)} by the projection matrix P^{(n)}.
12: end for
13: Output the projection matrices P^{(n)}.
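As a self-contained illustration of why stacking several trained refinement stages helps, the toy experiment below learns one global ridge-regression mapping per stage (a single-neighborhood analogue of Eq. (6)) on synthetic "patches" and reports the reconstruction error after each stage. The smoothing operator is an arbitrary stand-in for the detail loss of an initial demosaicker, and the whole setup is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: "ground truth" patches and a detail-losing "demosaicked" version.
truth = rng.normal(size=(2000, 9))            # flattened 3x3 patches
blur = np.eye(9) * 0.6 + 0.4 / 9              # simple smoothing operator
demosaicked = truth @ blur

def fit_stage(current, target, lam=0.1):
    """Learn one global mapping from current patches to the residual
    (target - current): a single-atom analogue of Eq. (6)."""
    gram = current.T @ current + lam * np.eye(current.shape[1])
    return np.linalg.solve(gram, current.T @ (target - current))

current = demosaicked
for n in range(2):                            # two training-and-refining stages
    P = fit_stage(current, truth)
    current = current + current @ P           # refine, then train the next stage on the result
    rmse = np.sqrt(np.mean((current - truth) ** 2))
    print(f"stage {n + 1}: RMSE = {rmse:.4f}")
```

The error drops stage by stage, which is the same progressive behaviour reported for the full PCR in Table V of Section III.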
C. Phase 2: Progressive Refinements of Demosaicked Images

Referring to Fig. 1, the objective of this phase is to progressively demosaic the given mosaicked image through multiple stages of refinement. For that, our previously-developed demosaicing algorithm, the IRI [26], is exploited to generate a high-quality demosaicked image D_i^{(1)} as the starting point (the D_i^{(1)} here is the IRI result of the test image rather than of a training image). In each of the subsequent refinement stages, the corresponding projection matrix (i.e., the one with the same iteration index) supplied from the offline training stage is incorporated to improve the image quality of the current stage.
Let N be the last iteration of training in the offline training stage. Therefore, N projection matrices P^{(n)} (where n = 1, 2, ..., N) will have been generated for use in this demosaicing stage. In the n-th iteration, the computed demosaicked-image feature patches f_{D_i}^{(n)} (where i = 1, 2, ..., N_T) and the corresponding demosaicked-image patches z_i^{(n)} from D_i^{(n)} are used to reconstruct D_i^{(n+1)}. For each feature patch f_{D_i}^{(n)}, its nearest atom d_k^{(n)} in Ξ^{(n)} is identified via Eq. (2). Then, the refined feature patch f_{D_i}^{(n+1)} is computed by

$$f_{D_i}^{(n+1)} = p_k^{(n)} \cdot f_{D_i}^{(n)}. \tag{7}$$

The refined demosaicked-image patch z_i^{(n+1)} is then obtained by adding the refined feature patch f_{D_i}^{(n+1)} in (7) to the demosaicked-image patch z_i^{(n)}. By combining all the refined patches z_i^{(n+1)} and averaging the intensity values in the overlapping areas, we generate the refined demosaicked image D_i^{(n+1)}.

The entire phase 2 is summarized in Algorithm 2. Lastly, it is important to note that the above-described progressive refinement is conducted for each of the three color channels separately.

Algorithm 2 Conduct Online Demosaicing with Progressive Refinements
Input: A test image I_i and the projection matrices P^{(n)}.
Output: The final refined demosaicking result I_i^{(N+1)}.
Steps:
1: Subsample I_i to M_i according to the Bayer pattern, followed by demosaicing M_i to D_i^{(1)} via the IRI [26].
2: for n = 1; n ≤ N; n = n + 1 do
3:   Generate demosaicked-image patches z_i^{(n)} (where i = 1, 2, ..., N_T) and derive their high-frequency feature patches f_{D_i}^{(n)} from z_i^{(n)}.
4:   Reduce the dimensionality of f_{D_i}^{(n)} by applying the PCA algorithm.
5:   for i = 1; i ≤ N_T; i = i + 1 do
6:     Search f_{D_i}^{(n)} against Ξ^{(n)} to find the nearest-neighbor atom via Eq. (2).
7:     Compute the refined feature f_{D_i}^{(n+1)} via Eq. (7).
8:     Reconstruct the refined demosaicked-image patch via z_i^{(n+1)} = f_{D_i}^{(n+1)} + z_i^{(n)}.
9:   end for
10:  Combine all the refined demosaicked-image patches z_i^{(n+1)} and average the pixel values over the overlapping areas to form the refined demosaicked image D_i^{(n+1)}.
11: end for
12: Output the final demosaicking result D_i^{(N+1)}.
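One iteration of this refinement (steps 3-10 of Algorithm 2) can be sketched as below for a single color channel. The 3 × 3 patch size and 2-pixel overlap follow the defaults given in Section III; `extract_features` is a hypothetical helper that must reproduce the same high-pass filtering used during training, and `atoms`, `projections`, `pca_mean`, `pca_basis` stand for the per-iteration outputs of the offline phase.

```python
import numpy as np

def refine_channel(channel, atoms, projections, pca_mean, pca_basis,
                   patch=3, step=1):
    """One online refinement pass over a single color channel (Algorithm 2,
    one value of n). Overlapping refined patches are averaged back together."""
    H, W = channel.shape
    acc = np.zeros((H, W))
    weight = np.zeros((H, W))
    feats = extract_features(channel)   # hypothetical: (H, W, 4) high-pass responses
    for i in range(0, H - patch + 1, step):
        for j in range(0, W - patch + 1, step):
            z = channel[i:i+patch, j:j+patch].ravel()             # patch z_i^(n)
            f = (feats[i:i+patch, j:j+patch, :].ravel() - pca_mean) @ pca_basis
            k = int(np.argmax(np.abs(atoms.T @ f)))               # nearest atom, Eq. (2)
            z_ref = z + projections[k] @ f                        # Eq. (7) plus add-back
            acc[i:i+patch, j:j+patch] += z_ref.reshape(patch, patch)
            weight[i:i+patch, j:j+patch] += 1.0
    return acc / np.maximum(weight, 1.0)                          # average overlapping areas
```

Running this once per projection matrix, feeding each output back in as the next input, realizes the progressive refinement of Fig. 1.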
III. EXPERIMENTS AND DISCUSSIONS

A. Experimental Settings

1) Training Dataset: For learning-based demosaicing methods, the training dataset has a direct impact on the quality of the demosaicked images. In our experiments, the proposed PCR is trained using the same 100 training images as used in [22], and no data augmentation (i.e., rotation and flipping) is applied to increase the size of the training dataset.

2) Testing Dataset: The IMAX dataset (18 images) and the Kodak dataset (24 images) are used in our experiments, since they have been widely adopted for assessing the performance of demosaicing methods (e.g., [22], [26], [27]). Note that the images in the IMAX dataset have weaker spectral correlations among the three color channels and are considered more challenging, while the images in the Kodak dataset have stronger correlations among the three color channels [32]. Each full-color test image is first down-sampled according to the Bayer pattern, followed by conducting the demosaicing process using the various demosaicing methods.

3) Evaluation Criteria: For objectively evaluating the performance of the demosaicing methods under comparison, two evaluation metrics are used: the color peak signal-to-noise ratio (CPSNR) and the structural similarity (SSIM) [33]. The former is applied to compute the intensity differences between the individual channels of the original image (i.e., the ground truth) and the corresponding channels of the demosaicked image; that is,

$$\mathrm{CPSNR} = 10 \times \log_{10} \frac{255^2}{\mathrm{CMSE}}, \quad \text{where} \quad \mathrm{CMSE} = \frac{\sum_{C \in \{R,G,B\}} \sum_{i=1}^{H} \sum_{j=1}^{W} \left\| I_o^{C}(i,j) - I_d^{C}(i,j) \right\|_2^2}{3 \times H \times W}, \tag{8}$$

the symbol ||·||_2 denotes the l2 norm, and the parameters H and W denote the height and the width of the image, respectively. Here I_o^{C}(i, j) and I_d^{C}(i, j) represent the R, G, or B value at the pixel (i, j) of the original image and of the demosaicked image, respectively. To further evaluate image quality from the perceptual viewpoint of the human visual system, the SSIM is computed as a supporting performance-evaluation index to reflect the similarity between the original image (denoted by subscript o) and the demosaicked image (denoted by subscript d). It considers the degradations incurred on the demosaicked image's structure instead of its pixel differences. Three types of similarities are considered in the SSIM: the luminance similarity, the contrast similarity, and the structural similarity. For a joint measurement, the average SSIM is obtained by averaging the SSIM values individually obtained from the R, G, and B channels. Note that the higher the SSIM value, the better the perceptual quality of the demosaicked image.
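Both metrics are easy to compute; the sketch below assumes 8-bit images stored as H × W × 3 arrays and uses scikit-image for the per-channel SSIM.

```python
import numpy as np
from skimage.metrics import structural_similarity

def cpsnr(original, demosaicked):
    """CPSNR of Eq. (8): the squared error is pooled over all pixels of all
    three color channels before taking the logarithm (8-bit range assumed)."""
    diff = original.astype(np.float64) - demosaicked.astype(np.float64)
    cmse = np.mean(diff ** 2)          # equals the triple sum divided by 3*H*W
    return 10.0 * np.log10(255.0 ** 2 / cmse)

def average_ssim(original, demosaicked):
    """SSIM averaged over the R, G, and B channels, as described in the text."""
    scores = [structural_similarity(original[..., c], demosaicked[..., c],
                                    data_range=255)
              for c in range(3)]
    return float(np.mean(scores))
```

Per-channel PSNR values, as reported in Table I, use the same formula restricted to a single channel.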
4) Default Settings for the PCR's Parameters: Unless otherwise specified, the simulations conducted for the performance evaluation of our proposed PCR adopted the following default settings. The size of the FI/DI (full-image/demosaicked-image) patches is set to 3 × 3 pixels with an overlap of 2 pixels between adjacent patches. Four one-dimensional high-pass filters (i.e., f1 = [−1, 0, 1], f2 = [−1, 0, 1]^T, f3 = [1, 0, −2, 0, 1], and f4 = [1, 0, −2, 0, 1]^T) are used to extract the high-frequency details of each image patch, followed by applying PCA for dimensionality reduction. The same dictionary training method as described in [31] and [29] is exploited for the PCR, i.e., 4,096 atoms for the dictionary, a neighborhood size of 2,048 training samples, and 5 million training samples drawn from the DI and FI patches. The algorithm's performance under various experimental settings is investigated in the following sub-sections.
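These default feature extractors are simple to reproduce. The sketch below applies the four filters listed above and builds a PCA basis retaining 99.9% of the energy, matching the pre-processing described in Section II-B; only the function names are ours.

```python
import numpy as np
from scipy.signal import convolve2d

# The four 1-D high-pass filters from the default settings, as 2-D kernels.
f1 = np.array([[-1.0, 0.0, 1.0]])            # horizontal gradient
f2 = f1.T                                    # vertical gradient
f3 = np.array([[1.0, 0.0, -2.0, 0.0, 1.0]])  # horizontal second derivative
f4 = f3.T                                    # vertical second derivative

def high_frequency_features(channel):
    """Per-pixel 4-dimensional feature: the responses of f1..f4."""
    responses = [convolve2d(channel, f, mode='same', boundary='symm')
                 for f in (f1, f2, f3, f4)]
    return np.stack(responses, axis=-1)

def pca_basis(feature_vectors, energy=0.999):
    """PCA keeping 99.9% of the average energy; one flattened 3x3 feature
    patch per row. Returns the mean and the projection basis."""
    mean = feature_vectors.mean(axis=0)
    centered = feature_vectors - mean
    _, S, Vt = np.linalg.svd(centered, full_matrices=False)
    var = S ** 2
    keep = int(np.searchsorted(np.cumsum(var) / var.sum(), energy)) + 1
    return mean, Vt[:keep].T                  # project with (x - mean) @ basis
```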
B. Performance Analysis

To evaluate the performance, the proposed PCR is compared with ten state-of-the-art demosaicing methods: the learned simultaneous sparse coding (LSSC) [19], the gradient-based threshold-free (GBTF) method [4], the local directional interpolation and nonlocal adaptive thresholding (LDI-NAT) method [5], the multiscale gradients-based (MSG) method [7], the residual interpolation (RI) [23], the minimized-Laplacian residual interpolation (MLRI) [24], the directional difference regression (DDR) [22], the fused regression (FR) [22], the adaptive residual interpolation (ARI) [27], and the iterative residual interpolation (IRI) [26]. Note that the source codes of these methods were all downloaded from their corresponding authors.
1) Objective Performance Analysis: Firstly, Table I compares the proposed method with the state-of-the-art algorithms in terms of average PSNR and CPSNR on the IMAX dataset, the Kodak dataset, and the combined dataset consisting of the IMAX and Kodak datasets (denoted as "IMAX+Kodak"). Note that the DDR [22] and FR [22] have two different sets of parameters for the IMAX and Kodak datasets; we use each fixed parameter set for all IMAX+Kodak images, so the evaluation on the combined IMAX+Kodak dataset has two results for DDR and for FR, respectively. From Table I, one can observe that our proposed PCR achieves the best average CPSNR among all these methods on the IMAX dataset. Moreover, the average CPSNR gain of our proposed PCR over the IRI [26] is an additional 1.1351 dB on the IMAX dataset and 1.0381 dB on the Kodak dataset. Compared with the second-best method on the IMAX dataset, the ARI [27], an additional 0.5985 dB gain is achieved. Compared with the three learning-based methods (i.e., LSSC [19], DDR [22], and FR [22]) on the IMAX dataset, the additional performance gains of our proposed PCR are 2.0250 dB, 1.0473 dB, and 0.7203 dB, respectively. Furthermore, Table I also presents the quantitative comparisons of all methods on the individual color channels in terms of PSNR, from which one can see that the average PSNR of our proposed PCR on each color channel also delivers the best performance among all these methods on the IMAX dataset. In addition, our proposed PCR also achieves the highest average CPSNR on the combined IMAX+Kodak dataset.

TABLE I. Average PSNR and CPSNR results (in dB) on the Kodak and the IMAX datasets. The first-ranked, the second-ranked, and the third-ranked performance in each column is highlighted in red, blue, and black bold, respectively.

Similar to Table I, Tables II and III compare the proposed method with the same set of state-of-the-art algorithms in terms of SSIM and MSSIM, respectively. One can again observe that our proposed PCR consistently yields the best performance on the IMAX dataset and on the combined IMAX+Kodak dataset. For the Kodak dataset, our proposed method delivers a performance fairly close to that of the best method, LSSC.

TABLE II. Average SSIM results on the Kodak and the IMAX datasets. The first-ranked, the second-ranked, and the third-ranked performance in each column is highlighted in red, blue, and black bold, respectively.

TABLE III. Average MSSIM results on the Kodak and the IMAX datasets. The first-ranked, the second-ranked, and the third-ranked performance in each column is highlighted in red, blue, and black bold, respectively.

Lastly, it is worthwhile to point out that several demosaicing methods achieve good performance on the Kodak dataset but not on the IMAX dataset. For example, the sparse-representation-based LSSC [19] delivers the best CPSNR performance on the Kodak dataset, yet its performance drops drastically on the IMAX dataset. The DDR [22] and FR [22] show the same trend. This might be due to the well-known fact that the spectral correlations among the three color channels in most test images from the Kodak dataset are extraordinarily high, as discussed in several previous works (e.g., [32]). Since these methods favor color images with a high degree of spectral correlation, they tend to produce inferior demosaicked results otherwise,
such as the ones from the IMAX dataset. In contrast, our proposed PCR is much less sensitive to the spectral correlations among the three color channels and thus achieves more robust performance regardless of which dataset is used.

2) Subjective Performance Analysis: Besides its superiority in objective evaluations, our proposed PCR method is also superior to the other demosaicing methods in subjective quality assessment. Two representative test images from each dataset are selected for visual comparison. Specifically, Figs. 2-5 show the demosaicked results of four cropped sub-images (as indicated by the green-colored frame in each test image) for close-up visual comparison.

Fig. 2. Visual comparisons for a close-up region of the image "IMAX 9" from the IMAX dataset: (a) scaled original, (b) ground truth, (c) LSSC [19], (d) GBTF [4], (e) FR [22], (f) ARI [27], (g) IRI [26], (h) PCR (ours).

In Fig. 2, one can see that the zoomed-in region contains a rattan basket (with light brown and dark black colors) and a fruit in red. Due to the weak spectral correlations among the three color channels, GBTF [4], FR [22], ARI [27], and IRI [26] tend to yield zipper effects. Furthermore, some distinct color artifacts can be observed around the edges of the black rattan in Fig. 2 (d)-(g). Although the sparse-coding-based LSSC [19] can yield a demosaicked image fairly similar to that of our proposed PCR, one can easily observe that many details of the PCR-demosaicked image are still superior to and more natural than those of the LSSC [19] (e.g., the brown rattan area and less zipper effect). In fact, our PCR result is nearly identical to the ground truth in this zoomed-in image.
Compared with the interpolation-based methods (i.e., GBTF [4], ARI [27], and IRI [26]), the learning-based demosaicing methods (i.e., LSSC [19], FR [22], and our proposed PCR) produce much improved demosaicked images. Fig. 3 demonstrates a close-up of the demosaicked images of "IMAX 12". Through comparison, one can see that our proposed PCR delivers the best visual quality on sharp edges and color details. On the contrary, the other methods under comparison produce many visible zipper artifacts and false colors along the edges of the drawings, as shown in Fig. 3 (c)-(g).

Fig. 3. Visual comparisons for a close-up region of the image "IMAX 12" from the IMAX dataset: (a) scaled original, (b) ground truth, (c) LSSC [19], (d) GBTF [4], (e) FR [22], (f) ARI [27], (g) IRI [26], (h) PCR (ours).

Fig. 4 displays a zoomed-in portion of a jalousie window, which is often used to evaluate the demosaicked results of highly-textured regions. One can see that our proposed PCR achieves a distinctly superior demosaicked result, which is almost identical to the original image. In contrast, one can observe obvious color aliasing and pattern shifts produced by LSSC [19], GBTF [4], FR [22], ARI [27], and IRI [26]. Although the LSSC [19] and GBTF [4] yield much better visual results than those in Fig. 4 (e)-(g), they still produce noticeable false-color artifacts, as illustrated in Fig. 4 (c)-(d).

Fig. 4. Visual comparisons for a close-up region of the image "Kodak 1" from the Kodak dataset: (a) scaled original, (b) ground truth, (c) LSSC [19], (d) GBTF [4], (e) FR [22], (f) ARI [27], (g) IRI [26], (h) PCR (ours).
To further evaluate the demosaicked results of textured areas using our proposed PCR, Fig. 5 demonstrates a zoomed-in area (denoted as "roof") that also has dense edges clustered in a small region. It is quite clear that all algorithms except our proposed PCR introduce false-color artifacts and/or edge distortion. In particular, the LSSC [19], GBTF [4], FR [22], and IRI [26] methods produce the most distinct color artifacts in this case, while the ARI [27] method delivers much better demosaicked image quality. However, with a closer look, one can see that our proposed PCR yields the least amount of artifacts; e.g., refer to the top-right portion of the sub-image "roof", where the ARI [27] produces visible color leakage in blue. This study of visual quality further demonstrates the superiority of our proposed PCR.

Fig. 5. Visual comparisons for a close-up region of the image "Kodak 8" from the Kodak dataset: (a) scaled original, (b) ground truth, (c) LSSC [19], (d) GBTF [4], (e) FR [22], (f) ARI [27], (g) IRI [26], (h) PCR (ours).

3) Computational Complexity Analysis: To analyze the computational complexity, the average running time per image incurred by each demosaicing model, experimented on the IMAX and the Kodak datasets, is measured. The computer used for conducting our simulation experiments is equipped with an Intel Xeon CPU E5-1630 [email protected] and 32 GB of RAM, and the software platform is Matlab R2017b. Note that all the competing demosaicing models are run under the same test conditions and procedures to ensure a meaningful and fair comparison. The run-time results of all demosaicing methods are documented in Table IV. For a more intuitive comparison between computational complexity and demosaicing performance, Fig. 6 plots the CPSNR versus the running time of all demosaicing methods.

TABLE IV. Average running time (in seconds) of different methods for demosaicing one image on the IMAX and Kodak datasets.

Methods        IMAX (500 × 500)   Kodak (512 × 768)
LSSC [19]      113.5446           173.7845
GBTF [4]       0.0713             0.1214
LDI-NAT [5]    277.1109           439.9941
MSG [7]        4.3644             6.6752
RI [23]        0.5473             0.9066
MLRI [24]      0.8638             1.3556
DDR [22]       4.8126             7.2928
FR [22]        8.5059             13.7943
ARI [27]       24.8025            39.5632
IRI [26]       2.4326             4.4233
PCR (Ours)     16.9421            26.6163

Fig. 6. CPSNR versus running time (in log scale) of different methods performed on the IMAX dataset.

From Table IV, it can be seen that the proposed PCR is computationally more expensive than most of the non-learning-based methods (i.e., GBTF [4], MSG [7], RI [23], MLRI [24], and IRI [26]) due to its iterative processing flow. However, it delivers considerably better demosaicing performance than these fast methods, as described in Sections III-B1 and III-B2. On the other hand, our proposed method is significantly more efficient than the LSSC [19] and LDI-NAT [5] methods, where the LSSC delivers the demosaicing performance most comparable to ours in terms of objective and subjective evaluations. Moreover, the DDR [22] and FR [22] methods achieve promising performance; however, these two methods depend strongly on the values set for their parameters, since they have a different optimized parameter set on each dataset. This study shows that the proposed PCR achieves a good trade-off between computational complexity and demosaicked image quality.
C. Default Parameter Settings

In this sub-section, the influences of several key parameters of our proposed PCR method are analysed: the number of atoms, the number of nearest neighbors, the number of training samples, and the number of iterations. The CPSNR performance of the proposed PCR on the IMAX and Kodak datasets is evaluated and presented in this sub-section; similar conclusions can be drawn from the evaluation of the SSIM results. In order to determine the best default value for each parameter, experimental simulations are performed by varying the value of the parameter under study while fixing the other parameters.

Fig. 7. A study of the resulting CPSNR performance of the proposed PCR algorithm with respect to the values set for: (left) the number of atoms, (middle) the number of nearest neighbors, and (right) the number of training samples (×10^6). These simulations are conducted on the IMAX (blue curve) and Kodak (red curve) datasets.

1) Influence of the Number of Atoms: In Fig. 7, the first plot shows the average CPSNR (dB) results on the IMAX (blue curve) and Kodak (red curve) datasets for different numbers of atoms. It can be observed that, as the number of atoms increases, the CPSNR gain of the proposed PCR becomes larger. However, the gain increases only slightly once the number of atoms exceeds 2^12 (i.e., 4,096). Therefore, 4,096 is set as the default number of atoms in our experiments.

2) Influence of the Number of Nearest Neighbors: The second plot of Fig. 7 presents the changes in CPSNR of the proposed PCR for various numbers of nearest neighbors. One can see that when the number of nearest neighbors increases from 2^9 to 2^11, the CPSNR performance is significantly improved, but it starts to level off beyond 2^11. This implies that a larger number of nearest neighbors leads to better demosaicing results and higher performance. However, a larger number of nearest neighbors also incurs higher computational complexity in both the training and demosaicing stages. Based on the above observation, 2^11 (i.e., 2,048) is selected as the number of nearest neighbors in our experiments.

3) Influence of the Number of Training Samples: The third plot of Fig. 7 shows the average CPSNR results on the combined IMAX+Kodak dataset for various numbers of training samples. It is easy to see that the number of training samples has a weak impact on the performance of the proposed PCR. In addition, unlike the influences of the numbers of atoms and nearest neighbors, there is a random variation in the performance gains incurred by the number of training samples; that is to say, more training samples cannot guarantee higher performance. In our experiments, 10 million is set as the default number of training samples.

4) Influence of the Number of Iterations: Since the proposed PCR conducts demosaicing on the input image in an iterative way, the more iterations, the better the demosaicked results, at the expense of consuming more running time. Table V reports the average CPSNR and SSIM results of PCR on the IMAX and Kodak datasets under different numbers of iterations. As one can see from Table V, both CPSNR and SSIM increase with more iterations. Specifically, the demosaicked results are greatly improved after the second iteration and only slightly improved in further iterations. As a result, the iteration number is set to 2 as the default value for our proposed PCR.

TABLE V. Average CPSNR results (in dB) and SSIM of PCR on the Kodak and the IMAX datasets with different numbers of iterations.

Iterations   IMAX CPSNR   IMAX SSIM   Kodak CPSNR   Kodak SSIM   IMAX+Kodak CPSNR   IMAX+Kodak SSIM
1            38.0335      0.9670      40.3675       0.9846       39.3672            0.9771
2            38.1948      0.9672      40.7562       0.9859       39.6585            0.9779
3            38.2563      0.9676      40.7792       0.9860       39.6979            0.9781
4            38.2725      0.9676      40.8279       0.9861       39.7327            0.9782

IV. CONCLUSIONS

In this paper, a new color image demosaicing framework is proposed, called the progressive collaborative representation (PCR). In the offline training stage, the intermediate demosaicked outputs of the last iteration and their corresponding original full-RGB images are used to learn multi-level mapping models. The established multi-level mapping models are then used to refine the intermediate demosaicked results in an iterative manner to produce the final demosaicked image. Extensive experiments conducted on two widely-used test datasets for evaluating color image demosaicing have clearly shown that our proposed demosaicing method achieves the best performance in terms of both quantitative evaluation and subjective assessment.

REFERENCES

[1] B. E. Bayer, "Color imaging array," U.S. Patent 3971065, Jul. 20, 1976.
[2] J. F. Hamilton Jr and J. E. Adams Jr, "Adaptive color plan interpolation in single sensor color electronic camera," U.S. Patent 5629734, May 13, 1997.
[3] L. Zhang and X. Wu, "Color demosaicking via directional linear minimum mean square-error estimation," IEEE Trans. on Image Process., vol. 14, no. 12, pp. 2167–2178, Nov. 2005.
[4] I. Pekkucuksen and Y. Altunbasak, "Gradient based threshold free color filter array interpolation," in Proc. IEEE Int. Conf. on Image Process., Sep. 2010, pp. 137–140.
[5] L. Zhang, X. Wu, A. Buades, and X. Li, "Color demosaicking by local directional interpolation and nonlocal adaptive thresholding," J. Electron. Imag., vol. 20, no. 2, p. 023016, 2011.
[6] I. Pekkucuksen and Y. Altunbasak, "Edge strength filter based color filter array interpolation," IEEE Trans. on Image Process., vol. 21, no. 1, pp. 393–397, Jan. 2012.
[7] I. Pekkucuksen and Y. Altunbasak, "Multiscale gradients-based color filter array interpolation," IEEE Trans. on Image Process., vol. 22, no. 1, pp. 157–165, Jan. 2013.
[8] X. Li, "Demosaicing by successive approximation," IEEE Trans. on Image Process., vol. 14, no. 3, pp. 370–379, Mar. 2005.
[9] X. Li, B. Gunturk, and L. Zhang, "Image demosaicing: A systematic survey," Proc. SPIE, vol. 6822, p. 68221J, Jan. 2008.
[10] R. Tan, K. Zhang, W. Zuo, and L. Zhang, "Color image demosaicking via deep residual learning," in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Jul. 2017, pp. 793–798.
[11] K. Cui, Z. Jin, and E. Steinbach, "Color image demosaicking using a 3-stage convolutional neural network structure," in Proc. IEEE Int. Conf. on Image Process., Oct. 2018, pp. 2177–2181.
[12] D. S. Tan, W.-Y. Chen, and K.-L. Hua, "Deepdemosaicking: Adaptive image demosaicking via multiple deep fully convolutional networks," IEEE Trans. on Image Process., vol. 27, no. 5, pp. 2408–2419, Feb. 2018.
[13] M. Gharbi, G. Chaurasia, S. Paris, and F. Durand, "Deep joint demosaicking and denoising," ACM Trans. Graph., vol. 35, no. 6, p. 191, 2016.
[14] F. Kokkinos and S. Lefkimmiatis, "Iterative joint image demosaicking and denoising using a residual denoising network," IEEE Trans. on Image Process., vol. 28, no. 8, pp. 4177–4188, Mar. 2019.
[15] B. K. Gunturk, Y. Altunbasak, and R. M. Mersereau, "Color plane interpolation using alternating projections," IEEE Trans. on Image Process., vol. 11, no. 9, pp. 997–1013, Nov. 2002.
[16] L. Chen, K.-H. Yap, and Y. He, "Subband synthesis for color filter array demosaicking," IEEE Trans. on Systems, Man, and Cybernetics-Part A: Systems and Humans, vol. 38, no. 2, pp. 485–492, Mar. 2008.
[17] B. Leung, G. Jeon, and E. Dubois, "Least-squares luma–chroma demultiplexing algorithm for Bayer demosaicking," IEEE Trans. on Image Process., vol. 20, no. 7, pp. 1885–1894, Jul. 2011.
[18] L. Fang, O. C. Au, Y. Chen, A. K. Katsaggelos, H. Wang, and X. Wen, "Joint demosaicing and subpixel-based down-sampling for Bayer images: A fast frequency-domain analysis approach," IEEE Trans. on Multimedia, vol. 14, no. 4, pp. 1359–1369, Aug. 2012.
[19] J. Mairal, F. R. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in Proc. IEEE Int. Conf. on Comput. Vis., vol. 29, Sep. 2009, pp. 54–62.
[20] J. Li, C. Bai, Z. Lin, and J. Yu, "Optimized color filter arrays for sparse representation-based demosaicking," IEEE Trans. on Image Process., vol. 26, no. 5, pp. 2381–2393, 2017.
[21] J. Duran and A. Buades, "Self-similarity and spectral correlation adaptive algorithm for color demosaicking," IEEE Trans. on Image Process., vol. 23, no. 9, pp. 4031–4040, May 2014.
[22] J. Wu, R. Timofte, and L. Van Gool, "Demosaicing based on directional difference regression and efficient regression priors," IEEE Trans. on Image Process., vol. 25, no. 8, pp. 3862–3874, Aug. 2016.
[23] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, "Residual interpolation for color image demosaicking," in Proc. IEEE Int. Conf. on Image Process., Sep. 2013, pp. 2304–2308.
[24] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, "Minimized-Laplacian residual interpolation for color image demosaicking," Proc. SPIE, vol. 9023, p. 90230L, Mar. 2014.
[25] W. Ye and K.-K. Ma, "Image demosaicing by using iterative residual interpolation," in Proc. IEEE Int. Conf. on Image Process., Oct. 2014, pp. 1862–1866.
[26] W. Ye and K.-K. Ma, "Color image demosaicing using iterative residual interpolation," IEEE Trans. on Image Process., vol. 24, no. 12, pp. 5879–5891, Sep. 2015.
[27] Y. Monno, D. Kiku, M. Tanaka, and M. Okutomi, "Adaptive residual interpolation for color and multispectral image demosaicking," Sensors, vol. 17, no. 12, p. 2787, Dec. 2017.
[28] Z. L. Baojiang Zhong, Kai-Kuang Ma, "Predictor-corrector image interpolation," J. Vis. Commun. and Image Represent., vol. 61, pp. 50–60, May 2019.
[29] R. Zeyde, M. Elad, and M. Protter, "On single image scale-up using sparse-representations," in Proc. Int. Conf. Curves Surfaces, 2010, pp. 711–730.
[30] L. Zhang, M. Yang, and X. Feng, "Sparse representation or collaborative representation: Which helps face recognition?" in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011, pp. 471–478.
[31] R. Timofte, V. De Smet, and L. Van Gool, "A+: Adjusted anchored neighborhood regression for fast super-resolution," in Proc. IEEE Asian Conf. Comput. Vis., 2014, pp. 111–126.
[32] F. Zhang, X. Wu, X. Yang, W. Zhang, and L. Zhang, "Robust color demosaicking with adaptation to varying spectral correlations," IEEE Trans. on Image Process., vol. 18, no. 12, pp. 2706–2717, Aug. 2009.
[33] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. on Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.