
Accepted Manuscript

A survey on region based image fusion methods

Bikash Meher, Sanjay Agrawal, Rutuparna Panda, Ajith Abraham

PII: S1566-2535(17)30758-3
DOI: https://doi.org/10.1016/j.inffus.2018.07.010
Reference: INFFUS 1002

To appear in: Information Fusion

Received date: 12 April 2017
Revised date: 25 July 2018
Accepted date: 30 July 2018

Please cite this article as: Bikash Meher, Sanjay Agrawal, Rutuparna Panda, Ajith Abraham, A survey on region based image fusion methods, Information Fusion (2018), doi: https://doi.org/10.1016/j.inffus.2018.07.010

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Research Highlights
• A state-of-the-art survey on region based image fusion is presented.
• A first-hand classification of region based fusion methods is fostered.
• A comprehensive comparison among the existing methods is highlighted.
• A detailed analysis with encouraging results is presented.
• This survey may attract researchers to explore the domain of region based fusion.


A survey on region based image fusion methods

1Bikash Meher, 1Sanjay Agrawal, 1Rutuparna Panda, 2Ajith Abraham

1Department of Electronics & Telecommunication Engineering, VSS University of Technology, Burla, India
Email-id: [email protected]; [email protected]

2Machine Intelligence Research Labs, Washington, USA
Email-id: [email protected]

Corresponding author: Rutuparna Panda, email-id: [email protected]
Abstract
Image fusion has been emerging as an important area of research. It has found many applications, such as surveillance, photography and medical diagnosis. Image fusion techniques are developed at three levels: pixel, feature and decision. Region based image fusion is one of the feature level methods. It possesses certain advantages: it is less sensitive to noise, more robust, and avoids misregistration. This paper presents a review of region based fusion approaches. A first-hand classification of region based fusion methods is carried out. A comprehensive list of objective fusion evaluation metrics is highlighted to compare the existing methods. A detailed analysis is carried out and the results are presented in tabular form. This may attract researchers to further explore research in this direction.

Keywords: Image fusion, region based fusion, segmentation.


1. Introduction
Image fusion is the process of merging information from multiple images of the same scene captured by different sensors, from different positions, or at different times. The fused image retains the complementary and redundant information of the input images that is useful for human visual perception and subsequent image processing tasks. The objective of image fusion is to combine the important information extracted from two or more images. To meet this objective, the fused result should satisfy the following requirements: (a) the fused image should retain the most complementary and significant information of the input images, (b) the fusion technique should not generate any synthetic information which may distract the human observer or a subsequent image processing application, and (c) it must avoid imperfect states, for instance misregistration and noise [1]. It is observed from the literature that image fusion approaches are classified into two types: spatial based and transform based.

In spatial based methods, the pixels of the images to be fused are combined in a linear or non-linear manner. The fused image is expressed mathematically as $I_F = \phi(I_1, I_2, \ldots, I_N) = \omega_1 I_1 + \omega_2 I_2 + \cdots + \omega_N I_N$, where $I_1, I_2, \ldots, I_N$ are the registered input images, $\phi$ denotes the fusion rule, and the $\omega_n$ are constant weights such that $\sum_{n=1}^{N} \omega_n = 1$.

On the other hand, in transform based techniques, the input images are first transformed from the spatial domain to some other domain by applying an appropriate transform, such as a wavelet or pyramid decomposition. A suitable fusion rule is applied to fuse the transformed coefficients, and the inverse transform reconstructs the fused image. The fused image is represented mathematically as $I_F = T^{-1}\left( \phi\left( T(I_1), T(I_2), \ldots, T(I_N) \right) \right)$, where $T(\cdot)$ is the forward transformation operator and $T^{-1}(\cdot)$ is the inverse transformation operator. A model of the image fusion concept is presented in Fig. 1.
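A corresponding sketch of the transform based scheme, taking T to be a 2-D wavelet decomposition; this assumes the PyWavelets package, and averaging the approximation band while taking the maximum-absolute detail coefficient is one common choice of the fusion rule φ, not the only one:

```python
import numpy as np
import pywt

def wavelet_fuse(img_a, img_b, wavelet="db2", level=3):
    """Transform domain fusion: I_F = T^{-1}(phi(T(I_1), T(I_2)))."""
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]            # average the approximation band
    for da, db in zip(ca[1:], cb[1:]):         # (cH, cV, cD) triple per level
        fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                           for x, y in zip(da, db)))
    return pywt.waverec2(fused, wavelet)       # inverse transform T^{-1}
```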
Fig. 1. Image fusion example (two sensors providing complementary and redundant information).
As discussed above, the fused image will retain both the complementary and the redundant
information from the input images.

On the basis of the level of abstraction, image fusion algorithms are applied at three stages: pixel, feature and decision level. At the pixel level, images are combined directly using individual pixels to form the fusion decision. Comprehensive surveys on pixel level image fusion are found in [2-6]. Feature level image fusion is achieved by region based fusion schemes [7-9]. In region based image fusion, an image is first partitioned into a set of regions, and the different features of these regions are extracted. The appropriate features from all the source images are merged to get the fused image. Such methods are less sensitive to noise and more robust. Further, the feature information can be used to design more intelligent semantic fusion rules. A first-hand survey on region based image fusion is presented in Section 2. Decision level image fusion techniques are based on the outputs of initial object detection and classification tasks [10-12]. Usually, a preliminary decision from feature based image fusion serves as the input to decision level fusion. A generic classification of image fusion methods is shown in Fig. 2.
Fig. 2. A generic classification of image fusion methods (pixel level, feature level and decision level).


Most fusion applications need analysis of multiple images of the same scene for improved results. For instance, in medical imaging applications, computed tomography (CT), magnetic resonance (MR) and positron emission tomography (PET) images are fused together for better analysis and diagnosis of diseases. Similarly, in remote sensing applications, multispectral (MS) images, which have low spatial resolution and high spectral density, are fused with panchromatic (PAN) images possessing high spatial resolution and low spectral density, to obtain the spectral content of the MS image at high spatial resolution. In surveillance applications, different images (infrared, visible, near infrared) are taken from different sensors and fused for detection or for night vision. In photography applications, multifocus images, multi-exposure images, etc. are fused to get an image that is better perceived by human vision and better suited to computer processing. Some examples of the different image fusion applications are illustrated in Fig. 3. The methods used are described in [13-16]. The following methods are used in Fig. 3: a) deep convolutional neural network based fusion [13], b) optimum spectrum mask based fusion [14], c) non-subsampled contourlet transform and intensity-hue-saturation transform based fusion [15], and d) fuzzy integral based fusion [16]. Some of the applications of image fusion are carried out in real time. This has inspired researchers to develop more effective techniques for image fusion. The number of articles published has increased rapidly since 2010 [5].
Fig. 3. Example images of different image fusion applications: (a) photography (multifocus) [13], (b) medical diagnosis [14], (c) surveillance [15], (d) remote sensing [16]. (Each panel shows two source images and the fused result.)

A survey on different image fusion methods has been done by many researchers. A classification of image fusion methods based on multi-scale decomposition techniques is described in [17]. Reviews of image fusion techniques in remote sensing are presented in [18-20]. Solanky and Katiyar [18] focused on pixel based fusion methods in remote sensing. Ghassemian [19] carried out a survey on pixel, feature and decision level fusion, with more emphasis on pixel level methods. Vivone et al. [20] presented a comparison among different remote sensing image fusion algorithms, particularly focusing on multiresolution analysis and component substitution methods. Wang et al. [21] presented a review of image fusion methods based on the pulse coupled neural network (PCNN). An overview of multimodal medical image fusion is described in [22,23]. The authors in [22] used different image reconstruction and decomposition techniques, namely multiresolution based, sparse representation based and salient feature based. They experimented with different fusion rules and compared the results using different image quality assessment parameters.

Most of these surveys describe image fusion techniques based on pixels. However, pixel based fusion has several limitations: (i) blurring effects, (ii) high sensitivity to noise and (iii) misregistration. These limitations may be reduced by deploying region based image fusion techniques, which have the capability to utilize intelligent fusion rules. To the best of our knowledge, a survey on region based image fusion methods is not available in the literature. This has motivated us to carry out a survey on different region based image fusion methods. The paper mainly focuses on the different approaches and ideas underlying the existing region based fusion methods. The results of the different techniques are summarized, and an elaborate discussion comparing the various fusion methods is presented at the end. This may clearly set a path for more investigations into region based fusion techniques and open doors to new unsolved problems in the domain of image fusion.

The remainder of the manuscript is structured as follows. Section 2 explains
the different region based image fusion methods. Section 3 presents the performance
evaluation parameters. An elaborate discussion on a comparison of various methods is
presented in Section 4. Finally, Section 5 draws the conclusion giving a brief summary and
critique of the findings.

2. Region based image fusion methods

It is observed from the literature that feature level image fusion techniques can be further classified into machine learning, region based, and similarity matching to content based. In the machine learning method, features are extracted and a suitable classifier is used for fusion. In the region based method, the input images are divided into different regions using some segmentation technique; features are extracted from the regions and suitable fusion rules are used to get the fused image. The similarity matching to content based image retrieval technique uses the visual contents of an image, for instance shape, texture, colour and spatial layout, to denote the index; the relevant indices are combined to get the fused image. A first-hand classification of region based image fusion methods is proposed here, as shown in Fig. 4.

Fig. 4. Classification of region based image fusion methods (feature level fusion is divided into machine learning, region based, and similarity matching to content based approaches; the region based class is further divided into region partition, statistical and estimation based, and focus region detection and saliency map based algorithms).

Region based image fusion is carried out using three different approaches: (i) the region partition approach partitions the source images into distinct regions using standard segmentation methods, and the characteristics of the regions are used to get the fused image; (ii) the statistical and estimation based approach partitions the source images into regions using advanced region segmentation algorithms followed by a statistical image formation model, and a joint region map is developed by analysing the region map of each source image to produce the fused image; (iii) the focus region detection and saliency map based approach aims at separating the significant foreground object from the background, leading to perceptually coherent regions. The advantages and disadvantages of these three approaches are mentioned later in this section.
A generic block diagram of the region based image fusion scheme is presented in Fig. 5. A region based image fusion procedure reads two or more input images. The images are segmented into different regions using various segmentation algorithms. Features such as edges, texture, contours, etc. are extracted from each of the regions using suitable feature extraction techniques. The features are combined using relevant fusion rules to get the fused image; a minimal code sketch of this pipeline is given after Fig. 5.

Fig. 5. A schematic block representation of the region based image fusion method (input images → segmentation → feature extraction → combination using a fusion rule → fused image).
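To make the pipeline of Fig. 5 concrete, the following minimal sketch fuses two pre-registered grayscale multifocus inputs. The SLIC superpixel segmentation (from scikit-image) and the local-variance activity feature are illustrative stand-ins for the many segmentation algorithms and region features discussed below, not a method from any particular reference.

```python
import numpy as np
from skimage.segmentation import slic

def region_fusion(img_a, img_b, n_segments=200):
    """Generic region based fusion: segment, extract a per-region activity
    feature, then copy each region from the more active (sharper) source."""
    avg = (img_a.astype(float) + img_b.astype(float)) / 2.0
    # Joint segmentation on the average image so both sources share one region map
    labels = slic(avg, n_segments=n_segments, channel_axis=None)
    fused = np.empty_like(avg)
    for r in np.unique(labels):
        mask = labels == r
        # Feature extraction + fusion rule: pick the source with higher variance
        fused[mask] = img_a[mask] if img_a[mask].var() >= img_b[mask].var() \
                      else img_b[mask]
    return fused
```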


Based on the vast information available in the literature, the various region based image
fusion procedures are categorized into three classes as shown in Table 1.
Table 1
Classification of various region based image fusion algorithms.

Algorithm                        Method                                                     References
Region partition algorithm       DWT                                                        [24,25]
                                 Dual-tree complex wavelet transform (DT-CWT)               [28]
                                 DWT and highboost filter                                   [29]
                                 Nonsubsampled contourlet transform (NSCT)                  [30,31]
                                 Shift invariant shearlet transform (SIST)                  [32]
                                 Discrete wavelet frame transform (DWFT)                    [33]
                                 Independent component analysis (ICA)                       [34-38]
                                 Chebyshev-ICA                                              [39]
                                 ICA-support vector machine (ICA-SVM)                       [40]
                                 Artificial neural network (ANN)                            [43]
                                 PCNN                                                       [46]
                                 Fuzzy logic and particle swarm optimization (FPSO)         [47]
                                 Differential evolution (DE)                                [48]
                                 Bi-dimensional empirical mode decomposition (BEMD)         [49]
                                 Region segmentation and spatial frequency (RSSF)           [50]
                                 Region fusion structural similarity index (RF-SSIM)        [51]
                                 Compressed sensing (CS)                                    [52]
                                 Image matting                                              [53]
                                 Sparse representation (SR)                                 [54]
Statistical and estimation       Expectation maximization (EM)                              [58]
based algorithm                  Energy evaluation model                                    [61,62]
                                 Bivariate alpha-stable (BαS)                               [63]
                                 Non expectation maximization (NEM) and bootstrap sampling  [64]
Focus region detection and       Lifting stationary wavelet transform (LSWT)                [70]
saliency map based algorithm     NSCT and LSWT                                              [71]
                                 Quaternion wavelet transform (QWT) and normalized cut      [72]
                                 NSCT and focus area detection                              [73]
                                 Surface area based                                         [74]
                                 Shearlet and graph based visual saliency (GBVS)            [75]
                                 NSCT and frequency tuned (FT) saliency                     [76]

In region partition based algorithms, the first step is to partition the input images into distinct regions by employing standard segmentation techniques. Considering the characteristics of the regions, fusion rules are employed to get the fused image. In general, different approaches such as multiresolution analysis, ICA based transforms and optimization are incorporated with the segmentation algorithms. Some flow diagrams of the existing methods are shown. The flow diagram of the region partition based method explained in [49] is depicted in Fig. 6.
Fig. 6. Flow diagram of the region based BEMD fusion scheme [49] (each source image is region-segmented and decomposed by BEMD into IMFs and a residue; region based fusion rules build a composite BEMD representation, which yields the fused image).


The region based image fusion technique was first suggested by Zhang and Blum [24]. In their work, the authors synergistically combined pixel and feature based fusion. The wavelet transforms of the source images are merged to produce the fused image. The authors identified edges and regions of interest (ROI) as the important features for guiding the fusion process. It is to be noted that this approach involves division of a series of images at discrete resolutions. This problem was addressed by Piella [26,27]. The author proposed a general framework for pixel and region based image fusion utilizing a multiresolution approach. The segmentation stage is improved by performing a single segmentation from all the source images in a multiresolution fashion. The images are fused following an additive or weighted combination fusion rule. However, the author did not optimize the performance or investigate the effect of different parameters on the fusion process. Further, the regions are combined based on a simple region property such as average activity. The DWT lacks the shift invariance and directionality properties.
To overcome these problems, Lewis et al. reviewed a number of pixel level fusion algorithms (using averaging, pyramids, DWT and DT-CWT) and compared their findings with a new region based technique [28]. The authors used the DT-CWT for segmentation of the features of the input images to develop a region map. The properties of every region are determined, and the fusion is performed region by region in the wavelet domain. However, the authors pointed towards the need for improved higher level region based fusion rules. The quality of the fused image may deteriorate due to the inverse DT-CWT transform. To improve the fusion results, Zaveri and Zaveri [29] used the highboost filter concept with the DWT for fusion. Their technique overcomes the shift variance problem, as the inverse wavelet transform is not needed in the algorithm. The highboost filter is used to get an accurate segmented image, which is utilized for obtaining the regions from the input images. The extracted regions are fused using fusion rules based on mean, max, standard deviation and spatial frequency. However, they concluded that more complex fusion rules may be developed to enhance the robustness of the method.
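For reference, the highboost filter used in [29] for segmentation support is the classical sharpening operator: the image plus a scaled highpass residual. A minimal sketch follows; the smoothing kernel size and boost factor are illustrative choices, not the values used in [29]:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def highboost(img, k=1.5, size=3):
    """High-boost filtering: emphasizes high-frequency detail while
    retaining the low-frequency content (k > 1 boosts the detail)."""
    low = uniform_filter(img.astype(float), size=size)  # lowpass (smoothed) image
    return img + k * (img - low)                        # add scaled highpass mask
```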
Many researchers have also used the NSCT approach for image fusion. A region based image fusion procedure for infrared (IR) and visible images using the NSCT is suggested by Guo et al. [30]. The features of the target region are used to segment the IR image. The input images are decomposed using the NSCT. The target region and the background regions are merged using different fusion rules, and the inverse NSCT is used to get the fused image. However, the fusion process involved has a high computational complexity. Zheng et al. used the NSCT and fuzzy logic for the fusion of IR and visible images [31]. First, the input images are segmented using the live wire segmentation method, followed by decomposition using the NSCT. A fuzzy fusion method is used in the low frequency domain and a region based rule is used in the high frequency domain. The inverse NSCT is applied to get the fused image.
To effectively analyse the geometric structure of remote sensing images with a computationally efficient approach, a SIST and regional selection algorithm is suggested by Luo et al. [32] for remote sensing image fusion. The feature vectors from the MS and PAN images are divided into regions using fuzzy c-means (FCM) clustering. Based on the regional similarity, an adaptive multi-strategy fusion rule for the high frequency bands is suggested. Finally, the fused image is found by taking the inverse SIST and inverse entropy component analysis.
The local features of an image are usually not considered during fusion. To resolve this, Wang et al. suggested a fusion procedure based on the DWFT and regional characteristics [33]. The transform coefficients are found from the two input images using the DWFT. Taking the mean of the transform coefficients, an average image is obtained which represents the approximate features of the input images. The average image is segmented based on the region features to get the region coordinates, and the coefficients and the region coordinates are mapped. The fused image is produced by combining the coefficients of each region using suitable fusion rules. However, the method needs a relatively long processing time due to the image segmentation involved.
Researchers have extensively used ICA for image fusion [34-42]. Mitianoudis and Stathaki [34] used ICA for region based image fusion. The authors investigated the effectiveness of a transform utilizing ICA and topographic ICA bases in image fusion. The fused image in the ICA domain is obtained by utilizing new pixel and region based fusion rules. The suggested method demonstrated better performance than conventional wavelet approaches, at the cost of a slightly higher computational load. In contrast to this framework, Cvejic et al. used ICA bases for region based multimodal image fusion [35,36]. The authors used different training subsets to identify the most significant regions in the source images, and subsequently combined the ICA coefficients utilizing fusion metrics for enhanced results. In [35], a combined approach is used to obtain the most significant regions from the source images. The fused image is obtained by merging the ICA coefficients from the obtained regions. The authors used the Piella metric to enhance the quality of the results. The performance improved, with an increase in the computational complexity.
These methods exhibit the following drawbacks: (i) the approaches cannot easily be extended to multiple-sensor fusion applications, and (ii) there is no theoretical justification for the existence of a global optimum of the objective function derived from the Piella index. To overcome this, the authors in [37] optimized the Piella index for multiple input sensors. To get the optimal solution, the authors revisited the work previously proposed in [34] and further suggested a method to find the optimum intensity range through optimization of a fusion index. The suggested method improves the original ICA based framework and produces a better fused image. In [38], the authors extended their work to a more advanced region based fusion approach, presenting a group of fusion rules using textural information. The suggested method outperforms the max-abs fusion rule in the case of multifocus images; on the contrary, it was not as good for multimodal image fusion. The reason may be that images of different modalities have different texture properties. In [39], Omar et al. used a combination of Chebyshev polynomials (CP) and ICA depending on the regional information of the source images. The method uses a segmentation technique to identify features such as texture, edges, etc. The fused image is found using distinct fusion rules as per the selected regions. The advantage of this method is that it offers an autonomous denoising property, combining the benefits of both CP and ICA. Nirmala et al. proposed a region based multimodal (visible and infrared) image fusion scheme using ICA and SVM [40]. The source images are jointly segmented in the spatial domain and the significant features of every region are calculated. The ICA coefficients of the particular regions are combined to form the fused ICA representation. As the ICA bases must be computed and the SVM trained, the suggested technique may significantly increase the computational load.
In recent studies, many optimization methods have been utilized for image fusion. Neural networks (NN) have been extensively employed for image fusion [43-45]. Hsu et al. suggested a multi-sensor image fusion method using an ANN, which merges the advantages of the feature and pixel level fusion schemes [43]. The basic concept is to segment only the far infrared image; the information from each segmented region is added to the visual image, and the fusion parameters are determined according to the different regions. In [46], the PCNN is utilized for the fusion of multi-sensor images. The procedure begins with the segmentation of the source images using the PCNN, and the output is used to direct the fusion process. The suggested method outperforms pixel based methods in terms of blurring effects, sensitivity to noise and misregistration. Saeedi and Faez [47] proposed a fusion of visible and IR images using fuzzy logic and PSO. The high frequency wavelet coefficients of the IR and visible images are fused using a fuzzy based approach, while PSO is used to learn the low frequency fusion rule. The low frequency and high frequency parts of the wavelet coefficients are then fused. The authors did not consider noise in their work; it is evident that for noisy images the segmentation process will result in over-segmentation with inaccurate regions. Aslantas et al. described a new method for thermal and visible images [48]. Instead of using a single weighting factor, the authors used multiple weighting factors for distinct regions to get the fused image. The weighting factors are optimized using DE. A new image fusion metric, the sum of correlation differences, is formed to assess the performance of the fused images during the optimization procedure.
Some other methods have also been proposed. Zheng and Qin used the BEMD technique for region based image fusion [49]. The source images are partitioned into several intrinsic mode functions (IMFs) and a residual image. The fusion is carried out as per the segmentation of the source images, which produces a combined BEMD representation, and the fused image is found by using the inverse BEMD. Because of the finite length of the wavelet function, the DWT induces energy leakage; this problem does not occur in BEMD because it acts as an adaptive highpass filter. Li and Yang [50] proposed a novel method for multifocus images utilizing region segmentation and spatial frequency. The normalized cut method is utilized to segment an intermediate fused image, and the two input images are segmented as per the segmentation of this intermediate image. The fused image is obtained by merging the segmented regions using spatial frequency. The advantage of the technique is that it does not use the multiresolution approach, as some information of the input image may be lost while performing the inverse multiresolution operation. The limitation of the method is that the computational time is higher compared to wavelet based approaches, as the segmentation procedure takes more time.
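Spatial frequency, used as the region selection criterion in [50] and as a fusion rule in several other surveyed methods, is simple to compute. A minimal sketch follows; approximating each region by its bounding box and the helper names are our illustrative choices, not details of [50]:

```python
import numpy as np

def spatial_frequency(patch):
    """SF of an image patch: combined row and column gradient energy."""
    p = patch.astype(float)
    if p.shape[0] < 2 or p.shape[1] < 2:
        return 0.0
    rf = np.sqrt(np.mean((p[:, 1:] - p[:, :-1]) ** 2))  # row frequency
    cf = np.sqrt(np.mean((p[1:, :] - p[:-1, :]) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def region_sf(img, mask):
    """SF of a segmented region, approximated by its bounding box."""
    ys, xs = np.nonzero(mask)
    return spatial_frequency(img[ys.min():ys.max() + 1, xs.min():xs.max() + 1])

def fuse_by_sf(img_a, img_b, labels):
    """Per region, keep the source whose region has higher spatial frequency."""
    fused = np.empty_like(img_a)
    for r in np.unique(labels):
        mask = labels == r
        src = img_a if region_sf(img_a, mask) >= region_sf(img_b, mask) else img_b
        fused[mask] = src[mask]
    return fused
```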
In most papers, a common segmentation technique for input images taken from distinct sensors has not been used for region based image fusion; these approaches are restricted to particular input images only. Luo et al. proposed a region partition strategy in which the segmentation is performed on the similar features of the input images, irrespective of the kind of input images [51]. The complementary and redundant correlations of the source images are distinguished by the fusion method. On the basis of the similar components, small homogeneous regions are merged. The final region map is obtained by comparing the resemblance between the images. The method has several advantages, such as generality of application, superior visual perception and simple realization without parameter setting.
Chen and Qin [52] used CS theory in the region based fusion framework. The authors considered both the compression capabilities of sensors and the intelligent understanding of image features for fusion. In a dynamic scene, it is very hard to accurately determine whether a pixel or region is blurry by utilizing the focus information only. Besides, another limitation of pixel based methods is that the fusion results are not accurate when the image patterns become complex. In contrast, Li et al. proposed an image matting fusion technique for the fusion of multifocus images in dynamic scenes [53]. The algorithm uses morphological filters to get rough segmentation results, followed by image matting to obtain the accurately focused regions. The fused image is found by merging the focused regions. Chen et al. [54] used the SR method for the fusion of multifocus images. The suggested technique merges the advantages of region based and sparse representation based fusion to obtain the fused image. The fusion of high resolution images using the mean shift segmentation method is described in [55]. The authors used SSIM for the measurement of regional similarity.
These methods need a precise selection of the segmentation technique; this choice is crucial to obtaining a good fused image. In these approaches, the regions are chosen and fused based on some regional characteristics. The significance of the statistical characteristics of the regions has not been considered, although it could be utilized to enhance the precision of the decision process in image fusion applications.
Most of the region based image fusion procedures suggested by different researchers have not applied the estimation theory approach rigorously. In the statistical and estimation based algorithms, the input images are first partitioned into regions using sophisticated region segmentation algorithms. A joint region map is developed by analysing the region map of each source image. A statistical image formation model is developed for every region in the joint region map. An estimation procedure is utilized in combination with the model to build an iterative fusion process that determines the model parameters and generates the fused image. A typical flow diagram of the statistical and estimation based algorithm explained in [58] is shown in Fig. 7.
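To convey the flavour of this iterative estimation, the following highly simplified sketch assumes, per region, that each sensor observes the true scene corrupted by zero-mean Gaussian noise of unknown variance, and alternates between estimating the scene and re-estimating the noise variances. It illustrates the principle only; it is not the actual model or algorithm of [58].

```python
import numpy as np

def fuse_region_em_like(observations, n_iter=20):
    """observations: list of 1-D arrays, the pixels of one region as seen by
    each sensor. Returns the estimated true-scene pixels for that region."""
    obs = [o.astype(float) for o in observations]
    var = [o.var() + 1e-9 for o in obs]                    # initial noise variances
    for _ in range(n_iter):
        w = np.array([1.0 / v for v in var])
        w /= w.sum()
        scene = sum(wi * o for wi, o in zip(w, obs))       # E-like step: fused estimate
        var = [np.mean((o - scene) ** 2) + 1e-9 for o in obs]  # M-like step
    return scene
```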
Fig. 7. Flow diagram of the region based EM fusion scheme [58] (each source image is region-segmented; region analysis feeds the EM fusion stage, which produces the fused image).

Sharma et al. [56] suggested a Bayesian fusion scheme inspired by estimation theory. A statistical signal processing approach to image fusion is suggested by Yang and Blum in [57]. Many researchers have proposed image fusion using EM algorithms [58-60]. In [58], the authors used the EM algorithm, which utilizes the features of regions for fusion in an optimal way. Each region is described by a statistical image formation model, and a region-level EM algorithm is developed by using the EM fusion procedure in combination with the model to produce the fused image. Zhang and Ge [61,62] proposed an image fusion method employing an energy estimation approach. The piecewise smooth Mumford-Shah model is used to partition the source images into regions on the basis of homogeneity, and a level set based optimization procedure is combined with it to speed up convergence. The fusion quality is determined using an energy model. The methods proposed in [27,28,58] fail to consider the importance of the statistical features of regions, which can be utilized to enhance the precision of the decision process in image fusion applications. In some research articles, the statistical model for fusion and the segmentation of images are integrated. Wan et al. integrated a multi-scale image segmentation technique and statistical feature extraction in their framework [63]. A region map of the input images is obtained using the DT-CWT and a statistical region combining algorithm. The source images are divided into significant regions containing salient information using the symmetric alpha-stable distribution, and the region features are modelled using the BαS technique. The fused image is obtained by applying a segmentation-driven approach in the complex wavelet domain.
In most practical applications, the dimensionality of the training data set is too large, which accounts for a large computation time in statistical image fusion. This constraint requires data reduction, achieved by choosing a suitable subset of the original training data set without appreciably compromising the image fusion precision. Zribi [64] proposed a non-parametric, region based image fusion method following the principle of bootstrap sampling. This reduces the dependence effect of pixels in the true image and also reduces the fusion time. In the statistical image formation model, the image sensors are expressed as a real scene degraded by distortion. The authors used a non-parametric EM procedure to determine the model parameters and the fused image. However, the method utilizes only two source images for fusion.


These methods characterize the segmented regions using statistical and estimation approaches. The quality of image segmentation is vital for determining accurate segmented regions; however, the ROI and the boundaries of focused regions are not accurately obtained with the methods discussed above. Many researchers have therefore used saliency map based algorithms for region based image fusion. In general, salient object detection is an image/background segmentation problem that aims at separating the significant foreground object from the background. It differs marginally from conventional segmentation, in which an image is segmented into perceptually coherent regions. To select an optimal region extraction method, researchers have compared many saliency analysis methods [65-69]. Fig. 8 shows the flow diagram of the focus region detection and saliency map based algorithm described in [76].

Fig. 8. Flow diagram of the region based NSCT fusion scheme of [76] (the infrared and visible light images are decomposed into low and high frequency subbands; a saliency map with free-regions removal extracts the object region; subband fusion rules produce the fused low and high frequency components and a primary fused image, which is combined with the object region to give the final fused image).

The problem with spatial domain based methods is that they may generate artefacts or imprecise results at the borders of focused areas, as the boundary of a focused area cannot be estimated correctly. This problem is solved by the use of multi-scale transform methods. However, the problem with these methods is choosing the proper fusion rule; additionally, some information of the input image may be lost while implementing the inverse multi-scale transform. To avoid these problems, methods combining the benefits of the spatial region based and transform domains have been proposed [70,71]. Chai et al. suggested a multifocus image fusion technique utilizing focus region detection and a multiresolution approach [70]. The authors used focus region detection and the LSWT to combine the benefits of the spatial and transform domain based fusion approaches. The idea of local visibility (LV) in the LSWT domain is used for the fusion of the lowpass subband coefficients, and a sum-modified-Laplacian inspired local visual contrast rule is applied for the fusion of the highpass subband coefficients. However, the fused image contains several imprecise results at the boundary of the focused region. To overcome this, the authors proposed a multifocus image fusion method based on the NSCT and focus region detection [71].
The image visibility rule in the NSCT domain is used for the fusion of the lowpass subband coefficients, and the local area spatial frequency rule is employed for the fusion of the highpass subband coefficients. The limitation of these methods is that the post-processing phase utilizing morphological procedures is not robust. To avoid this problem, Liu et al. suggested a multifocus image fusion method utilizing the QWT [72]. Initially, the authors used the local variance of the phases to identify focus or defocus for each pixel. They segmented the focus detection result using normalized cut to eliminate detection errors. Finally, the fused image is found utilizing spatial frequency as the fusion weight along the edge of the focus region. However, the method may create false information and irregular phenomena at the boundaries of the focused areas, as the boundary cannot be estimated precisely. To determine the boundaries of the focused region exactly, Yang et al. suggested a novel hybrid multifocus image fusion method based on the NSCT and focus area detection [73]. The authors modified their earlier work by using the log-Gabor filter for the high frequency subband coefficients. The limitation of the method is that the NSCT procedure is time consuming. Nejati et al. proposed a new focus measure based on the surface area of the region surrounded by juncture points for multifocus image fusion [74]. The objective of this measure is to differentiate focused regions from blurred regions. The juncture points of the source images are computed and the images are segmented utilizing these points. The surface area of every region is used as a quantity to identify the focused regions. An initial selection map for the fusion is obtained using this measure, which is then refined using morphological operations.
The most challenging task in region based image fusion is proper image segmentation. In recent years, numerous state-of-the-art salient region detection approaches have been suggested; still, they have some shortcomings. The most noticeable is that salient region detection approaches may capture the object region as well as some of the background region. Detection of the visually salient region has been an ongoing research topic for a long time. The saliency map is an emerging technique to identify the salient region, which can overcome the above problem. Zhang et al. suggested a multifocus image fusion technique based on focus region extraction using saliency analysis [75]. The GBVS procedure is utilized to identify the focused region in the input image. Subsequently, watershed and morphological techniques are employed to find the bounded area of the saliency map and eliminate pseudo focus areas. In the final step, the focused area is fused directly and the residual areas are fused using the shearlet transform.
In [76], Meng et al. suggested a new procedure based on object region detection and the NSCT. Frequency tuned (FT) saliency detection is used to acquire the saliency map of the IR image, and the significant object region is extracted from the IR image using a free-regions removal method. The input images are decomposed via the NSCT and distinct fusion rules are applied to the lowpass and highpass subband coefficients. The inverse NSCT is utilized to produce a primary fused image; the final fused image is found by combining the primary fused image with the object region. Again in [77], Meng et al. presented a region based image fusion technique to merge IR and visible images by employing a saliency map and interest points. A saliency map is built using a saliency detection process for the IR image and is explored to identify interest points. To get the salient region, a convex hull of the salient interest points is computed. The first saliency map is developed by merging the convex hulls of the salient interest points. Finally, different fusion rules are employed for the object region and the background. In [78], Han et al. proposed a saliency-aware fusion procedure for multimodal image fusion (IR and visible images) to enrich the visualization of the visible image. The process performs saliency recognition followed by a biased fusion. The information of the two sources is combined using Markov random fields. The fusion phase biases the end result in favour of the visible image, except when a region has distinct IR saliency. The fused image represents both the salient foreground from the IR image and the background as provided by the visible image. The fused images obtained with these methods show significant improvement compared to other methods, because the traditional segmentation process is avoided. The transforms used in these methods may, however, consume more time. Still, the different methods have their own importance, and further improvement in the fusion quality may require a suitable selection of the regional features.
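For reference, the FT saliency detector used in [76] is simple to state: the per-pixel distance between the image's mean feature vector and a slightly blurred copy of the image. A grayscale simplification follows (the published method operates on the Lab channels of a colour image, and the sigma here is an illustrative choice):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ft_saliency(img, sigma=1.0):
    """Frequency tuned saliency, grayscale simplification: distance between
    the global mean intensity and a Gaussian-blurred version of the image."""
    blurred = gaussian_filter(img.astype(float), sigma=sigma)
    return np.abs(img.astype(float).mean() - blurred)
```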
The region partition based algorithms are simpler than the other two approaches (statistical and estimation based, and focus region detection and saliency map based). Region partition based algorithms need a precise selection of the segmentation technique, which is crucial to obtaining a good fused image. In these approaches, the regions are selected and fused based on certain regional characteristics; however, the statistical characteristics of the regions, which could enhance the precision of the decision process, are not considered. The statistical and estimation approach uses these statistical characteristics to improve the precision of the decision process, but an efficient image segmentation technique is still needed to determine accurate segmented regions, and the ROI and the boundaries of focused regions are not accurately obtained. The focus region detection and saliency map approaches use ROI and boundary detection to obtain the fused image accurately, and the selection of the segmentation technique is not vital; however, these approaches are complex and application specific.

3. Performance evaluation
There are two ways in which the quality of a fused image is computed: qualitative (subjective) and quantitative (objective). Qualitative analysis means visual analysis, in which the fused image is compared with the source images by a group of observers. The analysis of the fused image considers different optical parameters such as spatial details, the size of objects, colour, etc. Qualitative analysis is typically accurate if it is done correctly; however, such methods are inconvenient, expensive and time consuming. It is a very difficult task in most image fusion applications due to the lack of a perfectly fused ground truth image. The other way to assess fusion performance is quantitative or objective evaluation; but again, it remains an issue how to measure the performance of the fused image objectively. In this survey, the quantitative fusion assessment metrics are classified into two groups based on the presence or absence of a reference fused image.

3.1. Objective fusion evaluation metrics with reference image

The reference image is the ideal fused image that is taken as a ground truth for validating the image fusion algorithm. The ground truth image may be available or manually constructed. A list of commonly used objective fusion evaluation metrics with a reference image is given in Table 2. The quality metrics employed to compute the fusion performance are the peak signal to noise ratio (PSNR), root mean square error (RMSE), mutual information (MI), structural similarity index measure (SSIM), correlation coefficient (CC) and universal quality index (UQI). The symbols used in the expressions carry the same meanings as in the corresponding reference papers cited in Tables 2 and 3.
Table 2
Objective fusion evaluation metrics with reference image.

1. PSNR [79]: calculated as the ratio of the number of intensity levels in the image to the error between corresponding pixels of the ideal and fused images. A higher PSNR value indicates superior fusion.
   $\mathrm{PSNR} = 20\log_{10}\!\left( \frac{L^2}{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left( I_r(i,j) - I_f(i,j) \right)^2} \right)$

2. RMSE [80]: estimates the quality of the fused image by relating the ideal and the fused image. The value of RMSE should be low, i.e. close to zero, for a good fused image.
   $\mathrm{RMSE} = \sqrt{\frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left( R(i,j) - F(i,j) \right)^2}$

3. MI [81]: the similarity of image intensity between the ideal and fused images is measured using mutual information. MI should be high for better fusion performance.
   $MI_{FAB} = MI_{FA} + MI_{FB}$

4. SSIM [82]: employed to calculate the quality of one image compared with another, provided the other image is of perfect quality. SSIM lies in [-1, 1] and should be high.
   $\mathrm{SSIM}(x,y) = \frac{\left( 2\mu_x\mu_y + C_1 \right)\left( 2\sigma_{xy} + C_2 \right)}{\left( \mu_x^2 + \mu_y^2 + C_1 \right)\left( \sigma_x^2 + \sigma_y^2 + C_2 \right)}$

5. CC [83]: computes the spectral feature similarity between the reference and the fused image. The value of CC should be high, i.e. close to 1.
   $CC = \frac{\sum_{i,j}\left( X_{i,j} - \bar{x} \right)\left( \hat{X}_{i,j} - \bar{\hat{x}} \right)}{\sqrt{\sum_{i,j}\left( X_{i,j} - \bar{x} \right)^2 \, \sum_{i,j}\left( \hat{X}_{i,j} - \bar{\hat{x}} \right)^2}}$

6. UQI [84]: calculates how much relevant information is transferred from the ideal image into the fused image. The value of this metric ranges between -1 and 1.
   $\mathrm{UQI} = \frac{\sigma_{xy}}{\sigma_x\sigma_y} \cdot \frac{2\bar{x}\bar{y}}{\bar{x}^2 + \bar{y}^2} \cdot \frac{2\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}$
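As a concrete illustration of the reference based metrics, a minimal NumPy sketch of PSNR, RMSE and a single-window SSIM follows. The SSIM constants use the common choice C1 = (0.01L)^2 and C2 = (0.03L)^2 for L = 255; note that standard SSIM implementations average the statistic over local windows rather than computing it globally as done here.

```python
import numpy as np

def psnr(ref, fused, levels=256):
    """PSNR between reference and fused images (higher is better)."""
    mse = np.mean((ref.astype(float) - fused.astype(float)) ** 2)
    return 20 * np.log10(levels ** 2 / mse)   # form shown in Table 2

def rmse(ref, fused):
    """RMSE between reference and fused images (lower is better)."""
    return np.sqrt(np.mean((ref.astype(float) - fused.astype(float)) ** 2))

def ssim_global(x, y, c1=6.5025, c2=58.5225):
    """Single-window SSIM per the Table 2 formula."""
    x, y = x.astype(float), y.astype(float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2))
```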

3.2. Objective fusion evaluation metrics without reference image

Only in very rare cases is a ground truth image available. Hence, it is highly desirable to assess the quality of the fused image without an ideal image. A quality metric with a reference image means that a ground truth or ideal image is available; a quality metric without a reference image means that it is not. In the latter case, the quality metric is computed using the source (input) images and the output (fused) image. A commonly used list of objective fusion evaluation metrics without an ideal/reference image is given in Table 3. The quality metrics used to evaluate the fusion performance are standard deviation (SD), entropy (H), cross entropy (CE), spatial frequency (SF), fusion mutual information (FMI), sum of the correlation of differences (SCD), the Piella metric and the Petrovic metric.
Table 3
Objective fusion evaluation metrics without reference image.

1. SD [85]: the intensity variation in the fused image is measured using the standard deviation. The SD value should be high.
   $SD = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left( f(i,j) - \mu \right)^2}$

2. H [85]: the information content of the fused image is computed using entropy. A fused image containing rich information has a high entropy value.
   $H = -\sum_{i=0}^{L-1} P_i \log P_i$

3. CE [86]: cross entropy is employed to measure the resemblance in information content between the input and fused images. The cross entropy should be low.
   $CE(A, B, F) = \frac{D\left( h_A \,\|\, h_F \right) + D\left( h_B \,\|\, h_F \right)}{2}$

4. SF [87]: spatial frequency is measured using only the fused image, by calculating its row frequency (RF) and column frequency (CF). The value of SF should be high.
   $SF = \sqrt{(RF)^2 + (CF)^2}$

5. FMI [86]: utilized to compute the level of dependency between the input and fused images. A larger value of FMI indicates an improved quality of the fused image.
   $FMI = I_{FA} + I_{FB}$

6. SCD [88]: the sum of the correlation of differences indicates how much information is transmitted from the source images to the fused image. For better fusion performance, the SCD should be high.
   $SCD = r\left( D_1, S_1 \right) + r\left( D_2, S_2 \right)$

7. Piella metric (Q0, QW, QE) [89]: estimates how well the salient information is represented in the fused image using local measurements (image correlation coefficient, mean luminance and contrast), and includes the significance of edge information. The dynamic range of Piella's metric is [0, 1] and the value should be near 1 for improved fusion performance.
   $Q_0(a,b,f) = \frac{1}{|W|}\sum_{w \in W}\left( \lambda(w)\, Q_0\!\left( a, f \mid w \right) + \left( 1 - \lambda(w) \right) Q_0\!\left( b, f \mid w \right) \right)$
   $Q_W(a,b,f) = \sum_{w \in W} c(w)\left( \lambda(w)\, Q_0\!\left( a, f \mid w \right) + \left( 1 - \lambda(w) \right) Q_0\!\left( b, f \mid w \right) \right)$
   $Q_E(a,b,f) = Q_W(a,b,f) \cdot Q_W\!\left( a', b', f' \right)^{\alpha}$ (the primed images are edge images)

8. Petrovic metric (QAB/F) [90]: measures the matching between the edges transferred in the fusion procedure. The dynamic range of QAB/F is [0, 1].
   $Q^{AB/F} = \frac{\sum_{n=1}^{N}\sum_{m=1}^{M}\left( Q^{AF}(n,m)\, w^{A}(n,m) + Q^{BF}(n,m)\, w^{B}(n,m) \right)}{\sum_{i=1}^{N}\sum_{j=1}^{M}\left( w^{A}(i,j) + w^{B}(i,j) \right)}$
4. Discussions
Image fusion is very important and beneficial for various image processing tasks such as object extraction, identification and computer vision. It is a tedious job due to misregistration, distortion and other artefacts. Recently, region based image fusion has been widely used in many applications such as photography [91-93], medical diagnosis [94-96], surveillance [97-99] and remote sensing [100-102]. Different methods for region based image fusion are proposed in the literature, as discussed in Section 2. A comparison among the various methods is quite a difficult task, as they use different modalities, databases and performance indices. The assessment of the quality of a fused image is carried out in two ways, qualitative and quantitative. In this paper, a comparison of various methods is carried out for different applications using quantitative and qualitative analysis. This comparison would benefit researchers in applying the different methods to the various applications.

The quantitative assessment of the existing methods is provided in the form of tables. The different tables are prepared from different references using similar test conditions, the same application and the same source images. The symbol '–' in the tables represents unavailability of the particular information; the corresponding performance metric was not evaluated due to the non-availability of the fused image. A comparison of different methods for the multifocus image fusion application is illustrated in Table 4.


Table 4
Comparison of different methods for multifocus images (reference: clock image).

Method                              MI      QAB/F   H       SF       Q0      Qw      QE
RSSF [29]                           6.9279  0.7119  8.7813  10.3350  0.7138  0.8312  0.6618
BEMD [49]                           5.9016  0.6250  7.4562  8.8864   0.6170  0.7690  0.6500
RF-SSIM [51]                        7.0572  0.4285  7.4262  9.1006   0.7608  0.8516  0.6652
BEMD [51]                           6.1665  0.4830  7.3462  9.1925   0.7133  0.8342  0.6632
RSSF [51]                           8.6973  0.7032  7.3691  8.9858   0.7628  0.8212  0.6516
DWT and highboost filter [29]       7.7344  0.7018  8.8066  10.0048  0.7608  0.8119  0.6521
SR [54]                             5.6106  0.7533  7.0657  8.4682   0.6911  0.7814  0.5918
CS [52]                             4.7918  0.4261  7.4123  8.5618   0.6124  0.7764  0.4939
LSWT [70]                           8.5518  0.7246  7.1549  8.0456   0.6818  0.7631  0.5825
QWT and normalized cut [72]         8.9971  0.7443  7.3419  8.3981   0.7632  0.8428  0.6649
NSCT and focus area detection [73]  8.6580  0.7502  7.2906  8.4729   0.7529  0.8341  0.5823
Surface area based [74]             8.8280  0.7400  –       13.6500  –       –       –
Shearlet and GBVS [75]              7.8397  0.7168  7.4337  8.7078   0.6812  0.7355  0.5862

It is seen that most of the methods perform well for the multifocus fusion example in terms of MI. However, the methods listed under the focus region detection group perform better than the others; a similar trend is observed for the parameter QAB/F. For instance, the QWT and normalized cut, NSCT and focus area detection, and surface area based methods give MI values of 8.9971, 8.6580 and 8.8280, respectively. The reasons may be: (i) the presence of properties such as multiresolution, multidirection and shift invariance, (ii) the use of the detected focus area as a fusion decision map to drive the fusion process, and (iii) the prevention of artefacts and imprecise results at the boundary of the focus region. Further, few researchers have computed the remaining metrics, so these metrics should be calculated for the given methods for a fair comparison. A comparison of various methods for the IR and visible image fusion application is illustrated in Table 5.
Table 5
Comparison of different methods for IR and visible images (reference: UN Camp image).

Method                  QAB/F   MI      H       SD       Piella  Qw
ICA-SVM [40]            0.6100  7.1700  7.1800  31.4100  0.9500  0.7118
DT-CWT [40]             0.4900  2.9800  6.8300  32.9900  0.9100  0.6512
Region based ICA [40]   0.5700  4.1600  6.5300  30.4100  0.9200  0.6602
BαS [63]                0.5106  3.0889  6.5181  42.1618  0.9210  0.7087
DT-CWT [63]             0.5069  2.6980  6.4490  41.0710  0.8912  0.6829
Region based ICA [36]   0.6000  3.0519  6.5657  33.6952  0.9300  0.7018
DT-CWT [36]             0.4600  3.0173  6.6727  34.1523  0.9100  0.6835
FPSO [47]               0.5520  2.7028  7.4100  32.3155  0.9128  0.6918
DT-CWT [47]             0.5030  3.1603  6.6600  36.9536  0.8910  0.6721

It is observed that the ICA based methods outperform the other methods in the IR and visible image fusion application. For instance, the ICA-SVM method gives a value of 0.6100 for QAB/F, 7.1700 for MI, 7.1800 for H, 31.4100 for SD, 0.9500 for the Piella metric and 0.7118 for Qw, which is better than the other methods. The reason may be that the ICA based methods perform segmentation and use numerous statistical properties of the regions to make intelligent decisions. The researchers have not computed most of the metric values; these values should be computed for the above mentioned methods for a fair comparison. A comparison of different methods for the medical image fusion application is depicted in Table 6.
Table 6
Comparison of different methods for medical images (reference: CT and MRI).

Method         MI      QAB/F   Q0      Qw      QE      H
RF-SSIM [51]   5.6274  0.6259  0.8928  0.8027  0.6283  6.8614
RSSF [51]      3.4326  0.5017  0.4372  0.4034  0.4128  5.4359
BEMD [51]      2.1028  0.4604  0.5293  0.7145  0.5612  6.8879
CS [52]        2.8958  0.5195  0.5634  0.7231  0.6275  6.3807
BαS [63]       –       0.6808  –       0.7541  –       –
DT-CWT [63]    –       0.6339  –       0.6722  –       –

The results in the table show that the region partition based algorithms perform better than the other methods. For example, the RF-SSIM method gives a value of 5.6274 for MI, 0.8928 for Q0, 0.8027 for Qw, 0.6283 for QE and 6.8614 for H, which is better than the other methods. The reason may be that the method merges homogeneous regions based on SSIM. The statistical and estimation based algorithms also show promising results, but using other metrics; researchers should compute the common metrics for a fair comparison.

The subjective or qualitative evaluation reflects the visual perception of the image, which varies from viewer to viewer. It is evaluated properly by experts with long experience in the field. However, this type of evaluation is not always preferred, as it is inconvenient and time consuming. The fused images for the different image fusion applications are depicted below. The experimental results obtained for the clock images by different researchers are shown in Fig. 9.
Fig. 9. Experimental results for clock images: (a) right focus, (b) left focus, (c) QWT and normalized cut [72], (d) CS [52], (e) shearlet and GBVS [75], (f) NSCT and focused area detection [73], (g) LSWT [70], (h) SR [54], (i) BEMD [49], (j) BEMD [51], (k) DWFT [33], (l) RSSF [29], (m) RF-SSIM [51], (n) RSSF [51], (o) RSSF [54], (p) DWT and highboost filter [29].
As shown in Fig. 9, the input images are represented in (a) and (b). In (a), the left clock is out of focus, and in (b) the right clock is out of focus. The fused images of the different methods are illustrated in (c)-(p). It is observed that almost all the images look similar. Therefore, subjective evaluation, in general, is treated as an ineffective tool for comparison. However, the visual quality of the images in (c), (e), (f) and (g) is better than that of the other methods. The images in (c) and (f) contain sharp edges at the upper portion of the right clock; the fused images obtained with these methods retain more of the relevant information of the source images. In (i) and (j), the edge at the top of the right clock is not sharp, i.e., a fraction of the edge information is lost. Further, the images in (d) and (k) have blurry clock hands.
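Many of these multifocus methods rest on a local focus measure, such as the spatial frequency of [87], to decide which source is sharper in each region. The following sketch illustrates the selection principle with simple square blocks in place of a true segmentation; it assumes two registered grayscale sources given as NumPy float arrays, with dimensions divisible by the block size for simplicity, and is not a reproduction of any particular method compared here.

    import numpy as np

    def spatial_frequency(block):
        # SF = sqrt(RF^2 + CF^2), where RF^2 and CF^2 are the mean
        # squared horizontal and vertical first differences [87]
        rf2 = np.mean(np.diff(block, axis=1) ** 2)
        cf2 = np.mean(np.diff(block, axis=0) ** 2)
        return np.sqrt(rf2 + cf2)

    def fuse_multifocus(img_a, img_b, block=16):
        # Copy each block from the source with the higher spatial
        # frequency, i.e., from the source that is in focus there
        fused = np.empty_like(img_a)
        rows, cols = img_a.shape
        for r in range(0, rows, block):
            for c in range(0, cols, block):
                sl = (slice(r, min(r + block, rows)),
                      slice(c, min(c + block, cols)))
                if spatial_frequency(img_a[sl]) >= spatial_frequency(img_b[sl]):
                    fused[sl] = img_a[sl]
                else:
                    fused[sl] = img_b[sl]
        return fused

Region based variants replace the square blocks with segmented regions and typically post-process the decision map to suppress isolated misclassified blocks. The experimental results obtained for the IR and visible image fusion application carried out by different researchers are shown in Fig. 10.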
Fig. 10. Experimental results for IR and visible image fusion application: (a) visible image, (b) IR image, (c) FPSO [47], (d) ICA-SVM [40], (e) BαS [63], (f) region based ICA [36], (g) region based ICA [40], (h) DT-CWT [40], (i) DT-CWT [63], (j) DT-CWT [47], (k) DT-CWT [36].
As shown in Fig. 10, (a) represents the visible image and (b) represents the IR image. The fused images from the different methods are shown in (c)-(k). It is observed that the person is identifiable in (b), while the trees and the fence are visible in (a). The fused images in (d), (f), (g), (h) and (k) demonstrate more detail than the other methods, whereas the images of the remaining methods do not show the details clearly. In some cases, such as (i), the person's image is blurry. The experimental results obtained for the medical image fusion application by different researchers are shown in Fig. 11.
Fig. 11. Experimental results for medical image fusion application: (a) CT image, (b) MRI image, (c) CS [52], (d) RF-SSIM [51], (e) RSSF [51], (f) BEMD [51].
As presented in Fig. 11, (a) represents the CT image and (b) represents the MRI image. The fused images from the different methods are shown in (c)-(f). It is observed that the image in (d) is more prominent than those of the other methods; the method preserves more of the salient source information. In the middle portion of (e), some detail is lost. Further, some artefacts are introduced at the top portion of the fused image in (f). It is worth reiterating why these particular fusion methods were chosen for the multifocus, IR and visible, and medical image applications: i) the tables in this work are prepared application-wise to show a fair comparison among the different methods, ii) the respective fusion methods are chosen because a larger number of quality metrics are reported for them, and iii) the fusion methods are grouped based on the results available for the same input images.
5. Conclusion
In this paper, we have presented a survey on region based image fusion methods. The region based image fusion algorithms are classified, for the first time, into three classes: region partition based, statistical and estimation based, and focus region detection and saliency map based. A comparison of different methods in terms of various metrics for different applications is carried out. Based on this comparison, an idea about the suitability of the various image fusion methods for different applications is developed. The focus region detection and saliency map based algorithms are mostly suitable for multifocus image fusion applications. The ICA based methods perform better in the case of multimodality image fusion applications. The region partition algorithms produce better fusion results in medical image fusion applications. It is observed that the saliency map method is an emerging technique that can be used in many applications. The quality assessment metrics QAB/F and MI are the most preferred in all the applications; however, the other metrics can also be computed for a fair comparison. The problems existing in the different methods are discussed. The survey carried out in this paper may help researchers pursue further research in the domain of region based image fusion.
References
[1] A. Goshtasby, S. Nikolov, Image fusion: Advances in the state of the art, Information Fusion. 8 (2) (2007) 114-118.
[2] H. Shen, X. Meng, L. Zhang, An integrated framework for the spatio-temporal-spectral fusion of remote sensing images, IEEE Transactions on Geoscience and Remote Sensing. 54 (12) (2016) 7135-7148.
[3] X. Meng, J. Li, H. Shen, L. Zhang, H. Zhang, Pansharpening with a guided filter based on three-layer decomposition, Sensors. 16 (7) (2016) 1068.
[4] C. Thomas, T. Ranchin, L. Wald, J. Chanussot, Synthesis of multispectral images to high spatial resolution: A critical review of fusion methods based on remote sensing physics, IEEE Transactions on Geoscience and Remote Sensing. 46 (5) (2008) 1301-1312.
[5] S. Li, X. Kang, L. Fang, J. Hu, H. Yin, Pixel-level image fusion: A survey of the state of the art, Information Fusion. 33 (2017) 100-112.
[6] B. Yang, Z.L. Jing, H.T. Zhao, Review of pixel-level image fusion, J. Shanghai Jiaotong Univ. 15 (2010) 6-12.
[7] H. Li, L. Li, J. Zhang, Multi-focus image fusion based on sparse feature matrix decomposition and morphological filtering, Opt. Commun. 342 (2015) 1-11.
[8] F. Mirzapur, H. Ghassemian, Improving hyperspectral image classification by combining spectral, texture, and shape features, International Journal of Remote Sensing. 36 (4) (2015) 1070-1096.
[9] S.S. Malik, S.P. Kumar, G.B. Maruthi, DT-CWT: Feature level image fusion based on dual-tree complex wavelet transform, in: Proceedings of IEEE International Conference on Information Communication and Embedded Systems (ICICES), 2014, pp. 1-7.
[10] B. Luo, M. Murtaza Khan, T. Bienvenu, J. Chanussot, L. Zhang, Decision Based Fusion for Pansharpening of Remote Sensing Images, IEEE Geoscience and Remote Sensing Letters. 10 (1) (2013) 19-23.
[11] Z. Yunfeng, Y. Yixin, F. Dongmei, Decision-level fusion of infrared and visible images for face recognition, in: Proceedings of Control and Decision Conference (CCDC), 2008, pp. 2411-2414.
[12] M. Fauvel, J. Chanussot, J. Atli Benediktsson, Decision Fusion for the Classification of Urban Remote Sensing Images, IEEE Transactions on Geoscience and Remote Sensing. 44 (10) (2006) 2828-2838.
[13] Y. Liu, X. Chen, H. Peng, Z. Wang, Multi-focus image fusion with a deep convolutional neural network, Information Fusion. 36 (1) (2017) 191-207.
[14] A. Dogra, B. Goyal, S. Agrawal, From Multi-Scale Decomposition to Non-Multi-Scale Decomposition Methods: A Comprehensive Survey of Image Fusion Techniques and Its Applications, IEEE Access. 5 (2017) 16040-16067.
[15] W. Kong, Y. Lei, X. Ni, Fusion technique for grey-scale visible light and infrared
images based on non-subsampled contourlet transform and intensity-hue-saturation
transform. IET signal processing. 5(1) (2011) 75-80.
[16] H. Zhou, H. Gao, Fusion method for remote sensing image based on fuzzy integral,
Journal of Electrical and Computer Engineering. 2014, 26.
[17] Z. Zhang, R.S. Blum, A categorization of multiscale-decomposition-based image
fusion schemes with a performance study for a digital camera application,
Proceedings of the IEEE 87(8) (1999) 1315-1326.
[18] V. Solanky, S.K. Katiyar, Pixel-level image fusion techniques in remote sensing: a review, Spat. Inf. Res. 24 (4) (2016) 475-483.
[19] H. Ghassemian, A review of remote sensing image fusion methods, Information
Fusion. 32 (2016) 75-89.
[20] G. Vivone, L. Alparone, J. Chanussot, M.D. Mura, A. Garzelli, G.A. Licciardi, R. Restaino, L. Wald, A Critical Comparison Among Pansharpening Algorithms, IEEE Transactions on Geoscience and Remote Sensing. 53 (5) (2014) 2565-2586.
[21] Z. Wang, S. Wang, Y. Zhu, Y. Ma, Review of Image Fusion Based on Pulse-
Coupled Neural Network, Arch. Comput. Methods Eng. 23 (4) (2015) 659-671.
[22] J. Du, W. Li, K. Lu, B. Xiao, An overview of multi-modal medical image fusion,
Neurocomputing. 215 (2016) 3-20.
[23] A.P. James, B.V. Dasarathy, Medical image fusion: A survey of the state of the
art, Information Fusion. 19 (2014) 4-19.
[24] Z. Zhang, R. Blum, Region-based image fusion scheme for concealed weapon detection, in: Proceedings of the 31st Annual Conference on Information Sciences and Systems, 1997, pp. 168-173.
[25] B. Matuszewski, L.K. Shark, M. Varley, Region-based wavelet fusion of ultrasonic, radiographic and shearographic non-destructive testing images, in: Proceedings of the 15th World Conference on Non-Destructive Testing, 2000, pp. 15-21.
[26] G. Piella, A region-based multiresolution image fusion algorithm, in: Proceedings of IEEE International Conference on Information Fusion, Vol. 2, 2002, pp. 1557-1564.
[27] G. Piella, A general framework for multiresolution image fusion: from pixels to regions, Information Fusion. 4 (4) (2003) 259-280.
[28] J.J. Lewis, R.J. O'Callaghan, S.G. Nikolov, D.R. Bull, Pixel- and region-based image fusion with complex wavelets, Information Fusion. 8 (2) (2007) 119-130.
[29] T. Zaveri, M. Zaveri, A novel region based multimodality image fusion method, Journal of Pattern Recognition Research. 2 (2011) 140-153.
[30] B.L. Guo, Q. Zhang, Y. Hou, Region based fusion of infrared and visible images using nonsubsampled contourlet transform, Chin. Opt. Lett. 6 (5) (2008) 338-341.
[31] Z. Yu, L. Yan, N. Han, A Region-based Image Fusion Algorithm for Detecting Trees in Forests, Open Cybernetics & Systemics Journal. 8 (2014) 540-545.
[32] X. Luo, Z. Zhang, X. Wu, A novel algorithm of remote sensing image fusion based on shift-invariant Shearlet transform and regional selection, AEU - International Journal of Electronics and Communications. 70 (2) (2016) 186-197.
[33] L. Wang, J. Du, S. Zhu, D. Fan, J. Lee, New region-based image fusion scheme using the discrete wavelet frame transform, in: Proceedings of IEEE International Conference on Intelligent Control and Automation (WCICA), 2016, pp. 3066-3070.
[34] N. Mitianoudis, T. Stathaki, Pixel-based and region-based image fusion schemes using ICA bases, Information Fusion. 8 (2007) 131-142.
[35] N. Cvejic, J. Lewis, D. Bull, N. Canagarajah, Adaptive region-based multimodal image fusion using ICA bases, in: Proceedings of IEEE International Conference on Information Fusion, 2006, pp. 1-6.
[36] N. Cvejic, D. Bull, N. Canagarajah, Region-based multimodal image fusion using
ICA bases, IEEE Sensors Journal. 7(5) (2007) 743-51.
[37] N. Mitianoudis, T. Stathaki, Optimal contrast correction for ICA-based fusion of
multimodal images, IEEE sensors journal. 8(12) (2008) 2016-26.
[38] N. Mitianoudis, S.A. Antonopoulos, T. Stathaki, Region-based ICA image fusion using textural information, in: Proceedings of IEEE International Conference on Digital Signal Processing (DSP), 2013, pp. 1-6.
[39] Z. Omar, N. Mitianoudis, T. Stathaki, Region-based image fusion using a combinatory Chebyshev-ICA method, in: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, pp. 1213-1216.
[40] D.E. Nirmala, A.B.S. Paul, V. Vaidehi, Improving independent component analysis using support vector machines for multimodal image fusion, Journal of Computer Science. 9 (2013) 1117-1132.
[41] S. Agrawal, S. Swain, L. Dora, BFO-ICA based multi focus image fusion, in: Proceedings of IEEE Symposium on Swarm Intelligence (SIS), 2013, pp. 194-199.
[42] N. Cvejic, D. Bull, N. Canagarajah, A novel ICA domain multimodal image fusion algorithm, in: Proceedings of International Society for Optics and Photonics on Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications, Vol. 6242, 2006, p. 62420W.
[43] S.L. Hsu, P. Gau, Region-based Image Fusion with Artificial Neural Network, Int. J. Inf. Math. Sci. 5 (2009) 264-267.
[44] S. Li, J.T. Kwok, Y. Wang, Multifocus image fusion using artificial neural networks, Pattern Recognition Letters. 23 (8) (2002) 985-997.
[45] M. Pagidimarry, K.A. Babu, An all approach for multi-focus image fusion using neural network, Artificial Intelligent Systems and Machine Learning. 3 (12) (2011) 732-739.
[46] M. Li, W. Cai, Z. Tan, A region-based image fusion scheme using pulse-coupled neural network, Pattern Recognition Letters. 27 (16) (2006) 1948-1956.
[47] J. Saeedi, K. Faez, Infrared and visible image fusion using fuzzy logic and population-based optimization, Applied Soft Computing. 12 (3) (2012) 1041-1054.
[48] V. Aslantas, E. Bendes, R. Kurban, A.N. Toprak, New optimised region-based multi-scale image fusion method for thermal and visible images, IET Image Processing. 8 (5) (2014) 289-299.
[49] Y. Zheng, Z. Qin, Region-based image fusion method using bidimensional empirical mode decomposition, Journal of Electronic Imaging. 18 (1) (2009) 013008.
[50] S. Li, B. Yang, Multifocus image fusion using region segmentation and spatial frequency, Image and Vision Computing. 26 (7) (2008) 971-979.
[51] X. Luo, J. Zhang, Q. Dai, A regional image fusion based on similarity characteristics, Signal Processing. 92 (5) (2012) 1268-1280.
[52] Y. Chen, Z. Qin, Region-based image-fusion framework for compressive imaging, Journal of Applied Mathematics. 2014 (2014).
[53] S. Li, X. Kang, J. Hu, B. Yang, Image matting for fusion of multi-focus images in dynamic scenes, Information Fusion. 14 (2) (2013) 147-162.
[54] L. Chen, J. Li, C.P. Chen, Regional multifocus image fusion using sparse representation, Optics Express. 21 (4) (2013) 5182-5197.
[55] L. Shuang, L. Zhilin, A region-based technique for fusion of high resolution images using mean shift segmentation, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII, Part B7 (2008).
[56] R.K. Sharma, T.K. Leen, M. Pavel, Probabilistic image sensor fusion, in: Advances in Neural Information Processing Systems, 1999, pp. 824-830.
[57] J. Yang, R.S. Blum, A statistical signal processing approach to image fusion for
concealed weapon detection, in: Proceedings of IEEE International Conference on
Image Processing, Vol 1, 2002, pp. I-I.
[58] J. Yang, R.S. Blum, A region-based image fusion method using the expectation-
maximization algorithm, in: Proceedings of IEEE Conference on Information
Sciences and Systems, 2006.
[59] X.B. Jin, Q.L. Zhang, EM image fusion algorithm based on statistical signal processing, in: Proceedings of IEEE 2nd International Congress on Image and Signal Processing, 2009, pp. 1-4.
[60] J. Yang, R.S. Blum, Image fusion using the expectation-maximization algorithm and a hidden Markov model, in: Proceedings of IEEE Conference on Vehicular Technology, Vol. 6, 2004, pp. 4563-4567.
[61] Y. Zhang, L. Ge, Region-based image fusion using energy estimation, in: Proceedings of IEEE International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, Vol. 1, 2007, pp. 729-734.
[62] Y. Zhang, Adaptive region-based image fusion using energy evaluation model for fusion decision, Signal, Image and Video Processing. 1 (3) (2007) 215-223.
[63] T. Wan, N. Canagarajah, A. Achim, Segmentation-driven image fusion based on alpha-stable modeling of wavelet coefficients, IEEE Transactions on Multimedia. 11 (4) (2009) 624-633.
[64] M. Zribi, Non-parametric and region-based image fusion with Bootstrap sampling, Information Fusion. 11 (2) (2010) 85-94.
[65] L. Itti, C. Koch, E. Niebur, A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence. 20 (11) (1998) 1254-1259.
[66] J. Harel, C. Koch, P. Perona, Graph-based visual saliency, in: Advances in Neural Information Processing Systems, 2007, pp. 545-552.
[67] R. Achanta, S. Hemami, F. Estrada, S. Susstrunk, Frequency-tuned salient region detection, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1597-1604.
[68] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, S. Li, Salient object detection: A discriminative regional feature integration approach, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 2083-2090.
[69] J. Zhang, S. Sclaroff, Saliency detection: A boolean map approach, in: Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 153-160.
[70] Y. Chai, H. Li, Z. Li, Multi-focus image fusion scheme using focused region detection and multiresolution, Optics Communications. 284 (19) (2011) 4376-4389.
[71] H. Li, Y. Chai, Z. Li, Multi-focus image fusion based on nonsubsampled contourlet transform and focused regions detection, Optik (Stuttg). 124 (2013) 40-51.
[72] Y. Liu, J. Jin, Q. Wang, Y. Shen, X. Dong, Region level based multifocus image fusion using quaternion wavelet and normalized cut, Signal Processing. 97 (2014) 9-30.
[73] Y. Yang, S. Tong, S. Huang, P. Lin, Multi-focus image fusion based on NSCT and focused area detection, IEEE Sensors Journal. 15 (5) (2015) 2824-2838.
[74] M. Nejati, S. Samavi, N. Karimi, S.R. Soroushmehr, S. Shirani, I. Roosta, K. Najarian, Surface area-based focus criterion for multi-focus image fusion, Information Fusion. 36 (2017) 284-295.
[75] B. Zhang, X. Lu, H. Pei, H. Liu, Y. Zhao, W. Zhou, Multifocus image fusion algorithm based on focused region extraction, Neurocomputing. 174 (2016) 733-748.
[76] F. Meng, M. Song, B. Guo, R. Shi, D. Shan, Image fusion based on object region detection and Non-Subsampled Contourlet Transform, Computers and Electrical Engineering. (2016).
[77] F. Meng, B. Guo, M. Song, X. Zhang, Image fusion with saliency map and interest points, Neurocomputing. 177 (2016) 1-8.
[78] J. Han, E.J. Pauwels, P. De Zeeuw, Fast saliency-aware multi-modality image fusion, Neurocomputing. 111 (2013) 70-80.
[79] V.P.S. Naidu, Discrete Cosine Transform-based Image Fusion, Def. Sci. J. 60 (1) (2010) 48-54.
[80] L.F. Zoran, Quality evaluation of multiresolution remote sensing images fusion, UPB Sci. Bull. Series C. 71 (2009) 38-52.
[81] X. L. Zhang, Z.F. Liu, Y. Kou, J.B. Dai, Z.M. Cheng, Quality assessment of image
fusion based on image content and structural similarity, in: Proceedings of IEEE 2nd
International Conference on Information Engineering and Computer Science
(ICIECS), 2010, pp. 1-4.
[82] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment:
From error visibility to structural similarity, IEEE Trans. Image Process. 13(4)
(2004) 600-612.
[83] X.X. Zhu, R. Bamler, A sparse image fusion algorithm with application to pan-sharpening, IEEE Transactions on Geoscience and Remote Sensing. 51 (5) (2013) 2827-2836.
[84] Z. Wang, A.C. Bovik, A universal image quality index, IEEE Signal Processing Letters. 9 (3) (2002) 81-84.
[85] W. Wang, F. Chang, A multi-focus image fusion method based on laplacian pyramid, JCP. 6 (12) (2011) 2559-2566.
[86] M.B.A. Haghighat, A. Aghagolzadeh, H. Seyedarabi, A non-reference image fusion metric based on mutual information of image features, Computers & Electrical Engineering. 37 (5) (2011) 744-756.
[87] S. Li, J.T. Kwok, Y. Wang, Combination of images with diverse focuses using the spatial frequency, Information Fusion. 2 (3) (2001) 169-176.
[88] V. Aslantas, E. Bendes, A new image quality metric for image fusion: The sum of the correlations of differences, AEU - International Journal of Electronics and Communications. 69 (12) (2015) 1890-1896.
[89] G. Piella, H. Heijmans, A new quality metric for image fusion, in: Proceedings of
IEEE International Conference on Image Processing, 2003, pp. 173-176.
[90] V. Petrovic, C. Xydeas, Objective evaluation of signal-level image fusion performance, Opt. Eng. 44 (8) (2005) 087003.
[91] V. Aslantas, R. Kurban, Fusion of multi-focus images using differential evolution algorithm, Expert Systems with Applications. 37 (12) (2010) 8861-8870.
[92] Z. Liu, Y. Chai, H. Yin, J. Zhou, Z. Zhu, A novel multi-focus image fusion approach based on image decomposition, Information Fusion. 35 (2017) 102-116.
[93] Y. Liu, S. Liu, Z. Wang, Multi-focus image fusion with dense SIFT, Information
Fusion. 23 (1) (2015) 139-155.
[94] Z. Xu, Medical image fusion using multi-level local extrema, Information Fusion. 19 (2014) 38-48.
[95] L. Wang, B. Li, L. Tian, Multi-modal medical image fusion using the inter-scale and intra-scale dependencies between image shift-invariant shearlet coefficients, Information Fusion. 19 (2014) 20-28.
[96] Z. Liu, H. Yin, Y. Chai, et al., A novel approach for multimodal medical image fusion, Expert Systems with Applications. 41 (16) (2014) 7425-7435.
[97] Z. Zhou, B. Wang, S. Li, M. Dong, Perceptual fusion of infrared and visible images
through a hybrid multi-scale decomposition with Gaussian and bilateral filters,
Information Fusion. 30 (1) (2016) 15-26.
[98] J. Ma, C. Chen, C. Li, J. Huang, Infrared and visible image fusion via gradient transfer and total variation minimization, Information Fusion. 31 (2016) 100-109.
[99] R. Panda, M. K. Naik, Fusion of infrared and visual images using bacterial foraging
strategy, WSEAS Trans. on Signal Processing. 8(4) (2012) 145-156.
[100] G. Simone, A. Farina, F.C. Morabito, S. B. Serpico, L. Bruzzone, Image fusion
techniques for remote sensing applications, Information fusion. 3 (1) (2002) 3-15.
[101] J. Cheng, H. Liu, T. Liu, F. Wang, H. Li, Remote sensing image fusion via wavelet
transform and sparse representation, ISPRS Journal of Photogrammetry and Remote
Sensing. 104 (2015) 158-73.
[102] W. Wenbo, Y. Jing, K. Tingjun, Study of Remote Sensing Image Fusion and Its
Application in Image Classification, The International Archives of the
Photogrammetry, Remote Sensing and Spatial Information Sciences, 2008, pp.
1141-1146.