Article in Press: Multi-Focus Image Fusion Based On Cartoon-Texture Image Decomposition
Optik
Article history: Received 2 January 2015; Accepted 12 October 2015; Available online xxx

Keywords: Image fusion; Split Bregman algorithm; Total variation; Multi-component fusion

Abstract

In order to represent the source images effectively and completely, a multi-component fusion approach is proposed for multi-focus image fusion. The registered source images are decomposed into cartoon and texture components by using cartoon-texture image decomposition. The significant features are selected from the cartoon and texture components, respectively, to form a composite feature space. The local features that represent the salient information of the source images are integrated to construct the fused image. Experimental results demonstrate that the proposed approach works better in extracting the focused regions and improving the fusion quality than other existing fusion methods in both the spatial and transform domains.

© 2015 Published by Elsevier GmbH.

http://dx.doi.org/10.1016/j.ijleo.2015.10.098
1. Introduction

Multi-focus image fusion can be defined as a process of combining substantial information from multiple images of the same scene to create a single composite image that is more suitable for human visual perception or further computer processing [1]. It has been proven to be an effective way to extend the depth of field. In general, fusion methods can be categorized into two groups: spatial domain fusion and transform domain fusion [2]. In this paper, we concentrate on the spatial domain methods.

The spatial domain methods are easy to implement and have low computational complexity. They can be divided into two categories: pixel based methods and region based methods. The simplest pixel based fusion method is to take the average of the source images pixel by pixel. However, this simplicity may reduce the contrast of the fused image. To improve the quality of the fused image, some region based methods have been proposed that combine partitioned blocks or segmented regions based on their sharpness [3]. The sharpness is measured by using local spatial features [4], such as the energy of image gradient (EOG) and spatial frequency (SF). The focused blocks or regions are then selected from the source images and simply copied into the fused image. However, if the size of the blocks is too small, the block selection is sensitive to noise and is subject to incorrect selection of blocks from the corresponding source image; if the size of the blocks is too large, in-focus and out-of-focus pixels are partitioned into the same block, which is then selected to build the final fused image. Accordingly, blocking artifacts are produced and may compromise the quality of the final fused image. To eliminate the blocking artifacts, researchers have proposed some improved schemes. Li et al. [5,6] have selected the focused blocks by using learning based methods, such as artificial neural networks (ANN) and support vector machines (SVM); due to the difficulty of obtaining empirical data in most multi-focus image fusion cases, the learning based methods are not widely used. Fedorov et al. [7] have selected the best focus by tiling the source images with overlapping neighbourhoods and improved the visual quality of the fused image, but this method is afflicted by temporal and geometric distortions between images. Aslantas et al. [8] have selected the optimal block-size by using a differential evolution algorithm and enhanced the self-adaptation of the fusion method, but this method requires longer computational time. Wu et al. [9] have selected the focused patches from the source images by using a belief propagation algorithm, but the algorithm is complicated and time-consuming. Goshtasby et al. [10] have detected the focused blocks by computing the weighted sum of the blocks, but the iterative procedure is time-consuming. De et al. [11] have detected the focused regions by using a morphology-based focus measure in a quad-tree structure, which adapts the setting of the block-size. These schemes all achieve better performance than the traditional methods and significantly inhibit the blocking artifacts.
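To make the two sharpness measures mentioned above concrete, the following minimal Python sketch computes the EOG and SF of an image block using their standard definitions (cf. [3,4]); the function names and example usage are illustrative only and are not taken from the paper.

```python
import numpy as np

def eog(block: np.ndarray) -> float:
    """Energy of image gradient: sum of squared horizontal and vertical
    first differences over the block."""
    b = block.astype(np.float64)
    fx = b[1:, :] - b[:-1, :]   # vertical neighbour differences
    fy = b[:, 1:] - b[:, :-1]   # horizontal neighbour differences
    return float(np.sum(fx ** 2) + np.sum(fy ** 2))

def spatial_frequency(block: np.ndarray) -> float:
    """Spatial frequency: root of the squared row and column frequencies."""
    b = block.astype(np.float64)
    rf = np.sqrt(np.mean((b[:, 1:] - b[:, :-1]) ** 2))  # row frequency
    cf = np.sqrt(np.mean((b[1:, :] - b[:-1, :]) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))

# In block-based selection schemes, the block from the sharper source image
# is the one that yields the larger measure.
```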
In this paper, a multi-focus image fusion method based on cartoon-texture image decomposition is proposed, in which the focused regions are identified by comparing the EOG of the neighbourhood region of each pixel in the cartoon and texture components. The proposed method works well in inhibiting the blocking artifacts and representing the source images.

The rest of the paper is organized as follows. In Section 2, the basic idea of cartoon-texture image decomposition is briefly described, followed by the new fusion method based on cartoon-texture image decomposition in Section 3. In Section 4, extensive simulations are performed to evaluate the performance of the proposed method, and several experimental results are presented and discussed. Finally, concluding remarks are drawn in Section 5.

2. Cartoon-texture image decomposition

Assume that an observed image f represents a real scene. The image f may contain texture or noise. In order to extract the most meaningful information from f, most models [16–25] try to find another image u, "close" to f, such that u is a cartoon or simplification of f. These models assume the following relation between f and u:

f = u + v,   (1)

where v is noise or texture.

In 1989, Mumford et al. [16] established a model to decompose a black and white static image by using functions of bounded variation. In 1992, Rudin et al. [17] simplified the Mumford–Shah model and proposed the total variation minimization energy (ROF model):

E_ROF(u) = ∫ ||∇u|| dx dy + λ ∫ (u − u_0)^2 dx dy,   (2)

where u_0 denotes the observed image f and λ is a positive regularization parameter. The ROF model is very efficient for de-noising images while keeping sharp edges. However, the ROF model will remove the texture when λ is small enough [18]. In 2002, Vese et al. [19] developed a partial differential equation (PDE) based iterative numerical algorithm to approximate Meyer's weaker norm || · ||_G by using L^p. However, this model is time consuming. To improve the computational efficiency, many models and methods have been proposed. Osher et al. [20] proposed the Osher–Sole–Vese (OSV) model based on total variation (TV) and the H^{-1} norm. Aujol et al. [21] introduced dual norms to image decomposition. Chan et al. [22] proposed the CEP−H^{-1} model based on OSV. However, these methods are still complicated. In 2008, Goldstein et al. [23] proposed the Split Bregman method, which solves L1-regularized problems by combining variable splitting with Bregman iteration [25]. This algorithm is easy to implement and has low computational complexity. This paper performs cartoon-texture image decomposition based on the ROF model by using the Split Bregman algorithm.

Fig. 1(b and c) shows the cartoon-texture image decomposition results of the source images 'Clock'. It is obvious that the salient features of the cartoon and texture components correspond to the local features of the objects in focus. Thus, the cartoon and texture components can be used to build a robust fusion scheme that accurately discriminates the focused regions from the defocused regions. In this paper, the salient features of the cartoon and texture components are used to identify the focused regions and construct the fused image.

3. The proposed fusion method

3.1. Fusion algorithm

In this section, a novel fusion method based on image decomposition is proposed. The source images are first decomposed into cartoon and texture components, and the fused image is constructed from the significant features of these components. The proposed fusion framework is depicted in Fig. 2 and the detailed design is described as follows. For simplicity, this paper assumes that there are only two source images, namely I_A and I_B; the rationale behind the proposed scheme applies to the fusion of more than two multi-focus images. The source images are assumed to be registered.

Step 1: Perform cartoon-texture image decomposition on the source images. The source image I_A is decomposed into the cartoon component U_A and the texture component V_A; for the source image I_B, U_B and V_B have roles similar to U_A and V_A.

Fig. 2. Block diagram of the proposed multi-focus image fusion framework.
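A minimal sketch of Step 1 is given below. It assumes the ROF-type decomposition of Eq. (2) is solved with an off-the-shelf split-Bregman TV denoiser (scikit-image's denoise_tv_bregman); the helper name cartoon_texture and the default weight are illustrative choices, not values from the paper.

```python
import numpy as np
from skimage.restoration import denoise_tv_bregman

def cartoon_texture(image: np.ndarray, weight: float = 4.0):
    """Split an image into a cartoon (TV-smoothed) part and a texture
    residual, following the additive model f = u + v of Eq. (1)."""
    f = image.astype(np.float64)
    # weight balances fidelity to f against smoothness of u (tune per data)
    u = denoise_tv_bregman(f, weight=weight)   # cartoon: piecewise-smooth part
    v = f - u                                  # texture/noise residual
    return u, v

# Step 1 applied to two registered source images (grey-level arrays in [0, 1]):
# U_A, V_A = cartoon_texture(I_A)
# U_B, V_B = cartoon_texture(I_B)
```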
Moreover, the color source image can be processed as a single matrix made up of three individual source images, one per color channel; for a color image, the cartoon and texture components are therefore defined in the same way on each channel.

There are two key issues [13] for the fusion rules. One is how to measure the activity level of the cartoon and texture components, respectively, which reflects the sharpness of the source images. Fig. 3(a–d) shows the relationship between the multi-components of the source images 'Clock' and their 3D shapes. It is obvious that the salient protruding portions of the 3D shapes of the multi-components correspond to the salient regions of the cartoon and texture components, and the salient regions of the cartoon and texture components correspond to the focused regions of the source images. Thus, we use the EOG of the pixels within an M × N (M = 2s + 1, N = 2t + 1) window of the cartoon and texture components to measure the activity level, where s and t are positive integers. The EOG of the window centred at pixel location (i, j) is calculated as

EOG(i, j) = Σ_{m=i−s}^{i+s} Σ_{n=j−t}^{j+t} (f_x(m, n)^2 + f_y(m, n)^2),
f_x(m, n) = f(m + 1, n) − f(m, n),   f_y(m, n) = f(m, n + 1) − f(m, n),

where f denotes a cartoon or texture component. The EOG is computed within the sliding windows which cover the neighbourhood region of each pixel location (i, j) in U_A and U_B, as well as each pixel location (i, j) in V_A and V_B. The EOGs of the neighbourhood region of the pixel location (i, j) in U_A, U_B, V_A and V_B are, respectively, compared to determine which pixel is likely to belong to the focused regions. Two decision matrices H_U and H_V are constructed from this comparison: H_U(i, j) is set to "1" when the EOG of the neighbourhood of (i, j) in U_A is larger than that in U_B, and to "0" otherwise; H_V is obtained from V_A and V_B in the same way. Thus, "1" in H_U indicates the pixel location (i, j) in image U_A is in focus, while "0" in H_U indicates the pixel location (i, j) in image U_B is in focus. Likewise, "1" in H_V indicates the pixel location (i, j) in image V_A is in focus, while "0" in H_V indicates the pixel location (i, j) in image V_B is in focus. However, judging by EOG alone is not sufficient to distinguish all the focused pixels: there are thin protrusions, narrow breaks, thin gulfs, small holes, etc. in H_U and H_V. To overcome these disadvantages, morphological operations [26] are performed on H_U and H_V. Opening, denoted as H_U ◦ Z and H_V ◦ Z, is simply erosion of H_U and H_V by the structuring element Z, followed by dilation of the result by Z; this process can remove thin protrusions and thin gulfs. Closing, denoted as H_U • Z and H_V • Z, is dilation followed by erosion; it can join narrow breaks and thin gulfs. To correctly judge the small holes, a threshold is set and the holes smaller than the threshold are removed. Thus, the final fused cartoon and texture components are obtained by selecting, at each pixel location, the cartoon and texture values of the source image indicated by the refined decision matrices, and the fused image is reconstructed from the fused cartoon and texture components.
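The windowed EOG measure described above can be sketched as follows; the box filter replaces an explicit sliding-window loop, and the function name windowed_eog and the default half-sizes s and t are our illustrative choices (the paper's parameter settings are not given in the text above).

```python
import numpy as np
from scipy.ndimage import uniform_filter

def windowed_eog(component: np.ndarray, s: int = 3, t: int = 3) -> np.ndarray:
    """EOG of the (2s+1) x (2t+1) neighbourhood of every pixel of a
    cartoon or texture component."""
    f = component.astype(np.float64)
    fx = np.zeros_like(f); fx[:-1, :] = f[1:, :] - f[:-1, :]   # f(m+1, n) - f(m, n)
    fy = np.zeros_like(f); fy[:, :-1] = f[:, 1:] - f[:, :-1]   # f(m, n+1) - f(m, n)
    g = fx ** 2 + fy ** 2
    # Summing over the M x N window equals the windowed mean times the window size.
    m, n = 2 * s + 1, 2 * t + 1
    return uniform_filter(g, size=(m, n), mode="nearest") * (m * n)

# Activity maps for the cartoon and texture components of both sources:
# eog_UA, eog_UB = windowed_eog(U_A), windowed_eog(U_B)
# eog_VA, eog_VB = windowed_eog(V_A), windowed_eog(V_B)
```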
Fig. 3. The relationship between multi-component of the source images ‘Clock’ and their 3D shapes: (a) cartoon component of the far focused image, (b) cartoon component
of the near focused image, (c) texture component of the far focused image and (d) texture component of the near focused image.
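Continuing the sketch, the decision maps, their morphological refinement and the component-wise fusion could be implemented as below; the structuring element, the hole-area threshold and the helper name fuse_components are illustrative assumptions rather than the paper's exact settings.

```python
import numpy as np
from scipy.ndimage import binary_opening, binary_closing
from skimage.morphology import remove_small_holes

def fuse_components(comp_a, comp_b, eog_a, eog_b,
                    struct=np.ones((5, 5), dtype=bool), min_hole=64):
    """Select, pixel by pixel, the component value with the larger windowed
    EOG, after refining the binary decision map with opening, closing and
    small-hole removal."""
    h = eog_a > eog_b                         # True: source A in focus, False: source B
    h = binary_opening(h, structure=struct)   # remove thin protrusions and gulfs
    h = binary_closing(h, structure=struct)   # join narrow breaks
    h = remove_small_holes(h, area_threshold=min_hole)    # fill small holes in the A-region
    h = ~remove_small_holes(~h, area_threshold=min_hole)  # drop small A specks inside the B-region
    return np.where(h, comp_a, comp_b)

# Fused cartoon and texture components, then reconstruction via f = u + v:
# U_F = fuse_components(U_A, U_B, eog_UA, eog_UB)
# V_F = fuse_components(V_A, V_B, eog_VA, eog_VB)
# F = U_F + V_F
```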
4. Experimental results

Experiments are performed on four pairs of multi-focus source images, as shown in Fig. 4. The upper two pairs are grayscale images and the lower two pairs are color images. The proposed method is compared with fusion methods based on the Laplacian pyramid (LAP), the discrete wavelet transform (DWT), the nonsubsampled contourlet transform (NSCT), principal component analysis (PCA) and spatial frequency (SF), and the fusion quality is evaluated by mutual information (MI) and Q^{AB/F} [33]. In general, image registration should be performed before image fusion; the source images used here are assumed to be registered.

Fig. 4. Multi-focus source images: (a) near focused image 'Disk', (b) far focused image 'Disk', (c) near focused image 'Lab', (d) far focused image 'Lab', (e) far focused image 'Rose', (f) near focused image 'Rose', (g) far focused image 'Book' and (h) near focused image 'Book'.
4.1. Comparison results on grayscale images

For visual evaluation, the fused images 'Disk' and 'Lab' obtained by different fusion methods are shown in Figs. 5(a–f) and 6(a–f). The difference images between the fused images and the corresponding far focused source image of 'Lab' are shown in Fig. 7(a–f). The fused images obtained by the other fusion methods demonstrate obvious blurs and artifacts, such as the blurry regions of the white book in Fig. 5(a–d) and the upper edge of the student's head in Fig. 6(a–d). Blocking artifacts appear in the fused images obtained by SF, such as the upper edge of the clock in Fig. 5(e) and the upper edge of the student's head in Fig. 6(e). The contrast of the fused image obtained by PCA is the worst, and the contrast of the fused image obtained by the proposed method is the best. There are obvious residuals in the difference images obtained by LAP, DWT, NSCT, PCA and SF: there are distortions in the difference images in Fig. 7(a–c), and a few residuals in the left regions of Fig. 7(d and e). Thus, the fused image obtained by the proposed method achieves better visual quality, containing all the focused contents from the source images without introducing artifacts.

Fig. 5. The fused images 'Disk' obtained by LAP (a), DWT (b), NSCT (c), PCA (d), SF (e) and proposed (f).

Fig. 7. The difference images between the far focused source image 'Lab' and the corresponding fused images obtained by LAP (a), DWT (b), NSCT (c), PCA (d), SF (e) and proposed (f).
Table 1
The performance of different fusion methods for grayscale multi-focus images 'Disk' and 'Lab'.

Method     'Disk' MI   Q^{AB/F}   Time (s)   'Lab' MI   Q^{AB/F}   Time (s)
LAP        6.14        0.69       0.91       7.10       0.71       0.91
DWT        5.36        0.64       0.64       6.47       0.69       0.59
NSCT       5.88        0.67       463.19     6.95       0.71       468.51
PCA        6.02        0.53       0.32       7.12       0.59       0.08
SF         7.00        0.68       1.01       7.94       0.72       1.03
Proposed   7.25        0.72       21.08      8.20       0.75       17.09

Table 2
The performance of different fusion methods for color multi-focus images 'Rose' and 'Book'.

Method     'Rose' MI   Q^{AB/F}   Time (s)   'Book' MI   Q^{AB/F}   Time (s)
LAP        5.75        0.69       2.32       6.94        0.71       3.12
DWT        5.07        0.66       1.01       6.42        0.69       1.57
NSCT       5.33        0.69       222.86     6.51        0.69       366.40
PCA        5.73        0.70       0.04       7.16        0.62       0.14
SF         7.15        0.72       1.90       7.71        0.70       3.10
Proposed   7.30        0.73       42.14      7.87        0.73       66.82
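Tables 1 and 2 report mutual information (MI) and Q^{AB/F}. As a pointer to how the MI figures can be computed, the sketch below estimates the histogram-based mutual information between a source image and the fused image in the usual way (cf. [32]); the bin count and function name are our choices, and the summation over both sources is the commonly used convention rather than a detail stated in the text above.

```python
import numpy as np

def mutual_information(img_a: np.ndarray, img_f: np.ndarray, bins: int = 256) -> float:
    """MI(A;F) estimated from the joint grey-level histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_f.ravel(), bins=bins)
    p_af = joint / joint.sum()                    # joint distribution
    p_a = p_af.sum(axis=1, keepdims=True)         # marginal of A
    p_f = p_af.sum(axis=0, keepdims=True)         # marginal of F
    nz = p_af > 0
    return float(np.sum(p_af[nz] * np.log2(p_af[nz] / (p_a @ p_f)[nz])))

# The fusion MI for a fused image F and sources A, B is commonly reported as
# mi = mutual_information(I_A, F) + mutual_information(I_B, F)
```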
For quantitative comparison, the quantitative results on the grayscale multi-focus images in two quality measures, together with the running time, are shown in Table 1. The proposed method gains higher MI and Q^{AB/F} values than the other methods. One can see that the running time of the proposed method is larger than that of the other methods except for NSCT. Because the sliding window technique is applied for the detection of the focused regions, computing the EOG of all pixels of each sliding window in the proposed method requires a longer computational time.

4.2. Comparison results on color images

For visual evaluation, the fused images 'Rose' and 'Book' obtained by different fusion methods are shown in Figs. 8(a–f) and 9(a–f). The fused images obtained by LAP, DWT and NSCT demonstrate obvious blurs and artifacts, such as the upper edge of the rose flower in Fig. 8(a–c) and the right corner of the left book in Fig. 9(a–c). Blocking artifacts appear in the fused images obtained by SF, such as the right edge of the door frame in Fig. 8(e) and the cover of the left book in Fig. 9(e); in addition, a blurry edge between the two books appears in Fig. 9(e). The contrast of the fused images obtained by PCA is the worst, such as the rose flower in Fig. 8(d) and the cover of the left book in Fig. 9(d), while the contrast of the fused images obtained by the proposed fusion method is better than that of the other fusion methods. Thus, the fused image of the proposed method achieves better qualitative performance by integrating all the focused regions from the source images without introducing artifacts.

Fig. 8. The fused image 'Rose' obtained by LAP (a), DWT (b), NSCT (c), PCA (d), SF (e) and proposed (f).

Fig. 9. The fused image 'Book' obtained by LAP (a), DWT (b), NSCT (c), PCA (d), SF (e) and proposed (f).

For quantitative comparison, the results on the color multi-focus images in two quality measures are shown in Table 2. The proposed method gains higher MI and Q^{AB/F} values than the other methods. The computational times are also shown in Table 2. Again, the proposed method needs a longer running time than the other methods except for NSCT. The drawback of the high computational complexity lies in that the sliding window searches the entire image to compute the EOG of all the pixels within the window, which is inefficiently coded using a loop structure in MATLAB.

5. Conclusion and future work

In this paper, a multi-focus image fusion method based on cartoon-texture image decomposition has been proposed. The significant features computed from the cartoon and texture components are able to represent the salient information from the source images. Experiments have been performed on both grayscale and color images. The qualitative and quantitative evaluations have demonstrated that the proposed method outperforms the compared fusion methods and significantly improves the quality of the fused image. In the future, we will consider optimizing the proposed method to reduce its time consumption.
Acknowledgements

The work was supported by National Key Technology Science

References

[1] Y. Zhang, L. Chen, Z. Zhao, J. Jia, Multi-focus image fusion with robust principal component analysis and pulse coupled neural network, Optik 125 (17) (2014) 5002–5006.
[2] S.T. Li, X.D. Kang, J.W. Hu, B. Yang, Image matting for fusion of multi-focus images in dynamic scenes, Inf. Fusion 14 (2) (2013) 147–162.
[3] S.T. Li, J.T. Kwok, Y. Wang, Combination of images with diverse focus using spatial frequency, Inf. Fusion 2 (3) (2001) 169–176.
[4] W. Huang, Z. Jing, Evaluation of focus measures in multi-focus image fusion, Pattern Recognit. Lett. 28 (4) (2007) 493–500.
[5] S. Li, J.T. Kwok, Y. Wang, Multifocus image fusion using artificial neural networks, Pattern Recognit. Lett. 23 (8) (2002) 985–997.
[6] S. Li, J. Kwok, I. Tsang, Y. Wang, Fusing images with different focuses using support vector machines, IEEE Trans. Neural Networks 15 (6) (2004) 1555–1561.
[7] D. Fedorov, B. Sumengen, B.S. Manjunath, Multi-focus imaging using local focus estimation and mosaicking, in: Proceedings of the IEEE International Conference on Image Processing, Atlanta, GA, USA, October 2006, pp. 2093–2096.
[8] V. Aslantas, R. Kurban, Fusion of multi-focus images using differential evolution algorithm, Expert Syst. Appl. 37 (12) (2010) 8861–8870.
[9] W. Wu, X.M. Yang, Y.P.J. Pang, G. Jeon, A multifocus image fusion method by using hidden Markov model, Opt. Commun. 287 (2013) 63–72.
[10] A. Goshtasby, Fusion of multifocus images to maximize image information, in: Defense and Security Symposium, Orlando, FL, 2006, pp. 17–21.
[11] I. De, B. Chanda, Multi-focus image fusion using a morphology-based focus measure in a quad-tree structure, Inf. Fusion 14 (2) (2013) 136–146.
[12] Y.F. Li, X.C. Feng, Image decomposition via learning the morphological diversity, Pattern Recognit. Lett. 33 (2) (2012) 111–120.
[13] Y. Jiang, M. Wang, Image fusion with morphological component analysis, Inf. Fusion 18 (2014) 107–118.
[14] W. Casaca, A. Paiva, E.G. Nieto, P. Joia, L.G. Nonato, Spectral image segmentation using image decomposition and inner product-based metric, J. Math. Imaging Vis. 45 (3) (2013) 227–238.
[15] Z.C. Guo, J.X. Yin, Q. Liu, On a reaction-diffusion system applied to image decomposition and restoration, Math. Comput. Modell. 53 (5–6) (2011) 1336–1350.
[16] D. Mumford, J. Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Commun. Pure Appl. Math. 42 (5) (1989) 577–685.
[17] L. Rudin, S. Osher, E. Fatemi, Nonlinear total variation based noise removal algorithms, Phys. D: Nonlinear Phenom. 60 (1–4) (1992) 259–268.
[18] Y. Meyer, Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, University Lecture Series, vol. 22, AMS, 2001.
[19] L.A. Vese, S.J. Osher, Modeling textures with total variation minimization and oscillating patterns in image processing, SIAM J. Sci. Comput. 19 (1–3) (2003) 553–572.
[20] S. Osher, A. Sole, L. Vese, Image decomposition and restoration using total variation minimization and the H-1 norm, J. Sci. Comput. 1 (3) (2003) 349–370.
[21] J.F. Aujol, A. Chambolle, Dual norms and image decomposition models, Int. J. Comput. Vision 63 (1) (2005) 85–104.
[22] T.F. Chan, S. Esedoglu, F.E. Park, Image decomposition combining staircase reduction and texture extraction, J. Visual Commun. Image Represent. 18 (6) (2007) 464–486.
[23] T. Goldstein, S. Osher, The split Bregman method for L1-regularized problems, SIAM J. Imag. Sci. 2 (2) (2009) 323–343.
[24] Y.L. Wang, J.F. Yang, W.T. Yin, Y. Zhang, A new alternating minimization algorithm for total variation image reconstruction, SIAM J. Imag. Sci. 1 (3) (2008) 248–272.
[25] S. Osher, M. Burger, D. Goldfarb, J.J. Xu, W.T. Yin, An iterative regularization method for total variation-based image restoration, Multiscale Model. Simul. 4 (2) (2005) 460–489.
[26] X.Z. Bai, F.G. Zhou, B.D. Xue, Image fusion through local feature extraction by using multi-scale top-hat by reconstruction operators, Optik 124 (18) (2013) 3198–3203.
[27] Image sets: http://www.ece.lehigh.edu/spcrl, 2005.
[28] Image sets: http://www.imgfsr.com/sitebuilder/images, 2009.
[29] Image fusion toolbox: http://www.imagefusion.org/.
[30] NSCT toolbox: http://www.ifp.illinois.edu/minhdo/software/.
[31] Split Bregman toolbox: http://tag7.web.rice.edu/Split Bregman files/.
[32] D.J.C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003.
[33] C.S. Xydeas, V. Petrovic, Objective image fusion performance measure, Electron. Lett. 36 (4) (2000) 308–309.