https://doi.org/10.1007/s40747-022-00792-9
ORIGINAL ARTICLE
Abstract
Multimodal medical image fusion is an effective method to solve a series of clinical problems, such as clinical diagnosis and
postoperative treatment. In this study, a medical image fusion method based on convolutional sparse representation (CSR)
and mutual information correlation is proposed. In this method, the source image is decomposed into one high-frequency
and one low-frequency sub-band by the non-subsampled shearlet transform (NSST). For the high-frequency sub-band, CSR is used for
high-frequency coefficient fusion. For the low-frequency sub-band, different fusion strategies are used for different regions
through mutual information correlation analysis. Analysis of two kinds of medical image fusion problems, namely, CT–MRI and
MRI–SPECT, reveals that the performance of this method is robust in terms of five common objective metrics. Compared
with six other advanced medical image fusion methods, the proposed method achieves better results in subjective vision and
objective evaluation metrics.
Keywords Medical image fusion · NSST · Convolution sparse representation · Mutual information correlation
Laplacian pyramid decomposition [9], and gradient pyramid transformation [10], the NSST-based method can decompose the image in multiple directions, thus obtaining more image details. Compared with wavelet methods, such as the discrete wavelet and dual-tree complex wavelet, the NSST-based method can represent the curve and edge details of an image well. Compared with multiscale geometric transformations, such as the contourlet transform (COT) [11] and the shearlet transform (ST) [12], the NSST-based method does not produce the pseudo-Gibbs phenomenon due to frequency aliasing. However, most of the existing NSST decomposition methods use high decomposition levels, which not only increases the amount of calculation but also makes the high-frequency sub-bands susceptible to noise. To preserve the structural information of the image as much as possible and to extract additional salient details, a new multiscale decomposition method is proposed in this study. Unaffected by the scale parameters of general multiscale decomposition, this method uses NSST to decompose the image into only two scales, namely, one high-frequency sub-band and one low-frequency sub-band. In addition to using convolution sparse representation to enhance the detailed information of the high-frequency sub-band, correlation analysis should be used to extract the detailed information of the low-frequency sub-band, because the low-frequency sub-band contains rich detailed information. Sparse representation seeks to represent image features with as few sparse vectors as possible and is widely used in image reconstruction and denoising. The improvement of convolution sparse representation is that the sparse coefficients of local image blocks are replaced by global sparse coefficients.

The fusion strategy is important for the quality of the fused image. In multiscale decomposition, a common strategy is to measure the activity of the decomposition coefficients first and then fuse them in accordance with the mean or maximum value. For example, in [13, 14], the high- and low-frequency sub-bands adopt the maximum scheme for fusion. However, low-frequency sub-bands provide structure information similar to the source image, whereas high-frequency sub-bands contain important details; thus, the same fusion scheme cannot consider the similarity and importance of the image simultaneously. In [15], a weighted average fusion strategy is adopted for similar regions of images; in this strategy, the weight is calculated using a Siamese network. However, the definition of similar regions by this method directly affects the effect of the final image fusion. Recently, principal component analysis (PCA) [16], sparse representation [17, 18], the smallest univalue segment assimilating nucleus (SUSAN) [19], and the pulse coupled neural network (PCNN) [20, 21] have been used to enhance the salient information of fused images and measure the activity of decomposition coefficients. However, these methods have their own problems either in the selection of sparse dictionaries or in the training time. To obtain a better fusion effect, different fusion strategies are selected according to the different sub-bands in this study; namely, maximum fusion is used for the high-frequency sub-band and the details of the low-frequency sub-band, and weighted average fusion is used for the similar structural information in the low-frequency sub-band.

This study focuses on the determination of the decomposition scale and the fusion strategy of the different frequency bands in NSST decomposition. To avoid the influence of high noise and registration on the fusion of the high-frequency sub-band when the NSST decomposition scale is too high, this study only carries out one-level decomposition of NSST, that is, one high-frequency sub-band and one low-frequency sub-band (a toy sketch of this two-scale pipeline is given at the end of this section). How to use mutual information correlation analysis to mine the detailed information in the low-frequency sub-band is one of the research objectives of this study. It has been explained above that it is inappropriate for all sub-bands to adopt the same fusion strategy. Another objective of this study is to determine which fusion strategy should be adopted for the high-frequency sub-band and for the similar and dissimilar regions of the low-frequency sub-band.

The main innovation points of this study include the following three aspects:

1. The convolutional sparse representation (CSR) model is used to process the high-frequency sub-band, which increases the detailed features and reduces the block effect caused by NSST decomposition, as well as the redundant information of the different source images.
2. Mutual information correlation is used to extract the detail information of the low-frequency sub-band. Given that only a two-scale decomposition is conducted, the low-frequency sub-band contains abundant details. The mutual information correlation analysis can find the regions containing detailed information in the low-frequency sub-band.
3. Two different fusion strategies are used for the low-frequency sub-band. Similar structural information is fused using the weighted average scheme, where the weight takes the product of the correlation analysis coefficient and the regional energy sum. The Laplacian energy gradient is used to measure the activity of the dissimilar regions to reflect their contrast changes.

The remaining sections of this paper are organized as follows: the next section describes related work about NSST and CSR. The following section explains the methods in detail. In the next section, a comparative experiment is simulated, and the corresponding results are analyzed. The last section summarizes the study.
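For orientation, the following is a minimal sketch of the two-scale pipeline just described, not the paper's implementation: a Gaussian low-pass split stands in for the one-level NSST decomposition (an assumption made here because NSST has no standard Python implementation), absolute-max selection stands in for the CSR-based high-frequency rule, and a plain average stands in for the correlation-driven low-frequency scheme.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def toy_two_scale_fusion(img_a, img_b, sigma=2.0):
    """Toy stand-in for the proposed pipeline: split each source into one
    low-frequency and one high-frequency sub-band, fuse each sub-band with
    its own strategy, and reconstruct by summation."""
    a = img_a.astype(np.float64)
    b = img_b.astype(np.float64)
    low_a, low_b = gaussian_filter(a, sigma), gaussian_filter(b, sigma)
    high_a, high_b = a - low_a, b - low_b          # residuals carry the details
    # High-frequency sub-band: keep the more active coefficient (max scheme).
    high_f = np.where(np.abs(high_a) >= np.abs(high_b), high_a, high_b)
    # Low-frequency sub-band: plain average as a placeholder for the
    # correlation-driven weighted scheme developed in this paper.
    low_f = 0.5 * (low_a + low_b)
    return low_f + high_f                          # additive reconstruction
```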
decomposition, low-frequency and high-frequency sub-band fusion, and NSST reconstruction. For simplicity, two source images are used for illustration. First, NSST is applied to the source images I_a and I_b. After first-level decomposition, a high-frequency sub-band with significant details and a low-frequency sub-band with structure information can be obtained.

Thus, the sparse coefficient fusion rule of the high-frequency sub-band is defined as

$$X_{m,1:N}^{F}(x, y)=
\begin{cases}
X_{m,1:N}^{A}(x, y), & \text{if } \left\|X_{m,1:N}^{A}(x, y)\right\|_{1} \geq \left\|X_{m,1:N}^{B}(x, y)\right\|_{1}\\
X_{m,1:N}^{B}(x, y), & \text{otherwise.}
\end{cases}$$
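A minimal numpy sketch of this choose-max rule follows, assuming the N convolutional sparse coefficient maps X_{m,1:N} of each source's high-frequency sub-band are stacked into an array of shape (N, H, W); the array layout and function name are illustration choices, not part of the paper.

```python
import numpy as np

def fuse_csr_coefficients(coeffs_a, coeffs_b):
    """Pixel-wise choose-max fusion of convolutional sparse coefficient
    maps of shape (N, H, W): at each (x, y) the stack whose coefficient
    vector has the larger l1-norm is kept, as in the rule above."""
    act_a = np.abs(coeffs_a).sum(axis=0)   # l1 activity of source A at (x, y)
    act_b = np.abs(coeffs_b).sum(axis=0)   # l1 activity of source B at (x, y)
    mask = act_a >= act_b                  # True where A is more active
    return np.where(mask[None, :, :], coeffs_a, coeffs_b)
```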
maps can be preserved. Mutual information is often used in multimodal image registration; it is a statistical correlation method based on gray values. The greater the mutual information between two images, the higher the correlation between the two images. The mutual information quantity of an image can be calculated by the Kullback–Leibler divergence, and the mathematical form is as follows:

$$\mathrm{MI}(x, y)=\sum_{x}\sum_{y} P(x, y)\log \frac{P(x, y)}{P(x)P(y)},\tag{5}$$

$$H(x)=-\sum_{x} P(x)\log_{2} P(x),\tag{6}$$

$$H(y)=-\sum_{y} P(y)\log_{2} P(y).\tag{7}$$

$$E_{m}(x, y)=\sum_{i=-N}^{N}\sum_{j=-N}^{N} W_{L}(i+N+1,\ j+N+1)\times L_{m}(x+i,\ y+j)^{2}.\tag{9}$$

Here, N is the radius of the local region (2N + 1, 2N + 1), L_m is the low-frequency coefficients of image m, and W_L indicates the weight of the local pixel. Given that the low-frequency sub-band image is relatively smooth, the weight can be directly represented as 2^{2N−d}, where d is the distance from the neighborhood pixel to the center point; if N = 1, then the normalized W_L is defined as

$$W_{L}=\frac{1}{16}\begin{bmatrix}1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1\end{bmatrix}.$$

For the low-frequency sub-band with high correlation, the coefficient blocks are fused by the strategy of the weighted sum of energy of the center pixel and are defined as

$$L_{F}=\begin{cases}L_{A}, & \text{if } \mathrm{NEG}(x, y)_{1} \geq \mathrm{NEG}(x, y)_{2}\\ L_{B}, & \text{otherwise.}\end{cases}\tag{13}$$
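To make Eqs. (5) and (9) concrete, the sketch below estimates MI from a joint gray-level histogram and evaluates the weighted local energy with the 3 × 3 template above. The histogram estimator, the bin count, and the interior-pixel assumption are illustration choices that the text does not specify.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """MI of Eq. (5), estimated from a joint gray-level histogram of two
    equally sized patches a and b."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)        # marginal P(x)
    p_y = p_xy.sum(axis=0, keepdims=True)        # marginal P(y)
    nz = p_xy > 0                                # skip empty bins (log 0)
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

# Normalized 3x3 weight template W_L for N = 1, as given above.
W_L = np.array([[1, 2, 1],
                [2, 4, 2],
                [1, 2, 1]], dtype=np.float64) / 16.0

def local_energy(low, x, y):
    """Weighted center-pixel energy E_m(x, y) of Eq. (9) for N = 1;
    assumes (x, y) is an interior pixel of the sub-band `low`."""
    patch = low[x - 1:x + 2, y - 1:y + 2]
    return float((W_L * patch ** 2).sum())
```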
NSCT decomposition in the PC–LLE method, the high-frequency sub-bands are fused by the phase consistency rule. The information of interest–local Laplacian filter (IOI–LLF) method uses the local LLF to decompose the source image into residual and ground images and further decomposes the residual image based on the IOI. The CNN–CP method uses a trained Siamese convolution network to fuse the pixel activity information of the source image and generate a weight map. The first two methods are based on SR, whereas the last four methods are all multiscale decomposition methods.

Algorithm 1 Proposed Medical Image Fusion Algorithm
Input: source images A and B.
Part 1: NSST decomposition
01: For each source image S = [A, B];
02: Perform NSST decomposition on S to obtain {H, L};
03: End;
Part 2: Fusion of high-frequency sub-band
04: For each source image S = [A, B];
05: Calculate the convolution sparse coefficient by Eq. (2);
06: End;
07: Convolution sparse coefficient fusion by Eq. (3);
08: The fused high-frequency sub-band is obtained by Eq. (4);
Part 3: Fusion of low-frequency sub-band
09: For each 3×3 sliding window
10: Calculate the correlation of the low-frequency sub-band by the mutual information Eq. (5);
11: The low-frequency sub-band is divided into two different regions according to the correlation threshold T;
12: if NMI^{AB} > T, then fusion is performed by Eq. (10);
13: if NMI^{AB} < T, then fusion is performed by Eq. (13);
14: End;
Part 4: NSST reconstruction
15: Perform inverse NSST on {H, L} to obtain F;
Output: the fused image F.

Objective evaluation metrics

To evaluate the performance of the various methods, five widely recognized objective metrics, namely, entropy (EN) [39], structural similarity (Q_e) [40], mutual information (MI) [41], gradient (Q^{AB/F}) [42] and human eye visual perception (VIF) [43], are used in the experiment. EN can reflect the amount of information contained in the fused image; Q_e represents the degree of similarity between the fused and source images; MI is a mutual information indicator used to measure the information contained in the fused image; Q^{AB/F} is a quality metric based on gradient, which is mainly used to measure the edge information of fused images; VIF is the information ratio between the fused image and the source image and is used to evaluate the human visualization performance of the fused image.
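As one concrete instance of these metrics, a minimal sketch of EN is given below; the 256-bin histogram is an assumption, and a real evaluation would use reference implementations of all five metrics.

```python
import numpy as np

def entropy_metric(fused, bins=256):
    """EN = -sum_i p_i * log2(p_i) over the gray-level histogram of the
    fused image; larger values indicate more information content."""
    hist, _ = np.histogram(fused.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                     # drop empty bins before taking the log
    return float(-(p * np.log2(p)).sum())
```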
Experimental settings
Figure 5 shows the results of two sets of MRI–SPECT images obtained by different fusion methods. Among them, LP–ASR and SR–NSCT lose part of the MRI information, and local image distortion exists. The IOI–LLF, PC–LLE and CNN–CP methods have complete texture information, but some SPECT functional information is missing. The fusion effect of PA–PCNN, PCNN–NSST and the proposed method is better subjectively.

To evaluate the performance of each fusion method objectively, Tables 1 and 2, respectively, show the average scores of the CT–MRI and MRI–SPECT fusion results. The higher the index value, the better the fusion performance; the highest score is indicated in bold, and the lowest score is indicated by subscript. In addition, the performance of the proposed method is compared with that of several recent NSST methods. Among them, the COF–MLE–NSST method uses a co-occurrence filter to measure the activity of the low-frequency sub-band coefficients, the PSO–NSST method uses a particle swarm optimization algorithm to optimize the membership function of the low-frequency sub-band fuzzy logic system, and the PCNN–NSST method uses PCNN to fuse the high-frequency sub-band. Compared with the other nine methods, the proposed method ranks first in Q_e, MI, and Q^{AB/F} for CT–MRI and MRI–SPECT images, indicating that it preserves most of the structure information in the source images and keeps the edges and structure of the source images well. At the same time, because transform-domain methods are accompanied by the loss of a certain amount of information, the ranking of EN and VIF is not the highest, but it is still relatively high, indicating that the proposed method has good robustness. The proposed method is inferior to the PA–PCNN method in VIF, because the latter adopts a neuron perceptron similar to that of humans.

To compare the computational costs of the different fusion methods, the total time of 10 groups of CT–MRI fusion images is first calculated and then divided by 10 to obtain the average running time. The calculation was repeated 10 times, and the results and standard deviations are shown in Table 3. The proposed method is inferior to the LP–ASR and PC–LLE methods but superior to the other four methods. In particular, the performance of the proposed method is similar to that of PA–PCNN, but its computational efficiency is higher, because the iteration process of PCNN is time consuming. The IOI–LLF method has the lowest calculation efficiency, because IOI takes too much time.

The proposed algorithm's performance was also evaluated by changing the values of the parameters used in the proposed method, such as the NSST decomposition level and the number of directions. These values are obtained over 20 pairs of multi-modality medical images, and the average outcomes are shown in Table 4. Table 4 shows that as the decomposition level and the number of directions increase, the values of EN and MI also increase. The values of Q_e, Q^{AB/F} and VIF are optimal at level 3. In general, with the increase of the level, the value of each metric increases slightly.

Conclusions

In this study, we propose a multimodal medical image fusion method based on NSST and mutual information correlation analysis. Based on NSST scale decomposition, this method uses CSR to enhance the high-frequency detail information and uses mutual information correlation to mine the detail information of the low-frequency sub-band. Then, different fusion strategies are adopted for different areas of the
low-frequency sub-band according to the correlation. To achieve this goal, two new activity level measurement methods, based on the neighborhood energy gradient (NEG) and the central pixel energy sum, are designed. Comparisons with other advanced methods in numerous experiments prove the effectiveness of the proposed method. However, the method still has the following limitations: first, the setting of the threshold of the low-frequency sub-band correlation analysis has a certain influence on the final fusion effect. If the threshold is set too small, then the extraction of detail information is insufficient; if the threshold is set too large, then meaningless details in the MRI image are introduced into the fused image, causing
artifacts. In this study, the mutual information of the whole source image is used as the threshold value for the correlation analysis of the low-frequency sub-bands; this strategy is not an optimal scheme. In addition, Table 3 shows that this method is not as fast as some fusion methods, because the local mutual information correlation is calculated by a sliding window, resulting in low calculation efficiency. In the future, we will devote ourselves to the research of a more effective threshold determination scheme by combining the prior information of the source images.

Acknowledgements This project is supported by the Provincial Natural Science Foundation of Hunan, China (Grant No. 2020JJ6021), the Research Foundation of Education Bureau of Hunan Province, China (Grant No. 21A0451, No. 19C0483), and the Construct Program of the Key Discipline in Hunan Province: Control Science and Engineering.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Ganasala P, Kumar V (2016) Feature-motivated simplified adaptive PCNN-based medical image fusion algorithm in NSST domain. J Digit Imaging 29:73–85
2. Qi G, Wang J, Zhang Q, Zeng F, Zhu Z (2017) An integrated dictionary learning entropy-based medical image fusion framework. Future Internet 9(4):61
3. Petrovic V, Xydeas C (2004) Gradient-based multiresolution image fusion. IEEE Trans Image Process 13:228–237
4. Sundar K, Jahnavi M, Lakshmisaritha K (2017) Multi-sensor image fusion based on empirical wavelet transform. In: 2017 international conference on electrical, electronics, communication, computer, and optimization techniques (ICEECCOT). IEEE, pp 93–97
5. Liu Y, Liu S, Wang Z (2015) Multi-focus image fusion with dense SIFT. Inf Fusion 23:139–155
6. Xia J, Lu Y, Tan L et al (2021) Intelligent fusion of infrared and visible image data based on convolutional sparse representation and improved pulse-coupled neural network. Comput Mater Continua 67(1):613–624
7. Yuan G, Ma S, Liu J et al (2021) Fusion of medical images based on salient features extraction by PSO optimized fuzzy logic in NSST domain. Biomed Signal Process Control 69(12):102852
8. Ouerghi H, Mourali O, Zagrouba E (2020) Multi-modal image fusion based on weight local features and novel sum-modified-Laplacian in non-subsampled shearlet transform domain. In: International symposium on visual computing
9. Shen J, Zhao Y, Yan S, Li X (2014) Exposure fusion using boosting Laplacian pyramid. IEEE Trans Cybern 44:1579–1590
10. Chen G et al (2019) Weighted sparse representation and gradient domain guided filter pyramid image fusion based on low-light-level dual-channel camera. IEEE Photon J 99:1
11. Li GX, Wang K (2007) Color image fusion algorithm using the contourlet transform. Acta Electron Sin 35:112
12. Miao QG, Cheng S, Xu PF et al (2011) A novel algorithm of image fusion using shearlets. Opt Commun 284(6):1540–1547
13. Yin M, Liu X, Liu Y et al (2018) Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans Instrum Meas 68(1):49–64
14. Zhu Z, Zheng M, Qi G et al (2019) A phase congruency and local Laplacian energy based multi-modality medical image fusion method in NSCT domain. IEEE Access 7:20811–20824
15. Wang K, Zheng M, Wei H et al (2020) Multi-modality medical image fusion using convolutional neural network and contrast pyramid. Sensors 20(8):2169
16. Shahdoosti HR, Ghassemian H (2016) Combining the spectral PCA and spatial PCA fusion methods by an optimal filter. Inf Fusion 27:150–160
17. Liu Y, Liu S, Wang Z (2015) A general framework for image fusion based on multi-scale transform and sparse representation. Inf Fusion 24:147–164
18. Wang K, Qi G, Zhu Z, Chai Y (2017) A novel geometric dictionary construction approach for sparse representation based image fusion. Entropy 19:306
19. Garaigordobil A, Ansola R, Veguería E et al (2019) Overhang constraint for topology optimization of self-supported compliant mechanisms considering additive manufacturing. Comput Aided Design 109:33–48
20. Subashini MM, Sahoo SK (2014) Pulse coupled neural networks and its applications. Expert Syst Appl 41(8):3965–3974
21. Wang M, Shang X (2020) An improved simplified PCNN model for salient region detection. Vis Comput 10–12:1–13
22. Easley G, Labate D, Lim W-Q (2008) Sparse directional image representations using the discrete shearlet transform. Appl Comput Harmon Anal 25(1):25–46
23. Kim M, Han DK, Ko H (2016) Joint patch clustering-based dictionary learning for multimodal image fusion. Inf Fusion 27:198–214
24. Liu H, Liu Y, Sun F (2015) Robust exemplar extraction using structured sparse coding. IEEE Trans Neural Netw 26:1816–1821
25. Yang J, Wright J, Huang TS, Ma Y (2010) Image super-resolution via sparse representation. IEEE Trans Image Process 19(11):2861–2873
26. Dong W et al (2011) Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans Image Process 20(7):1838–1857
27. Yang B, Li S (2010) Multifocus image fusion and restoration with sparse representation. IEEE Trans Instrum Meas 59(4):884–892
28. Yin H, Li S, Fang L (2013) Simultaneous image fusion and super-resolution using sparse representation. Inf Fusion 14:229–240
29. Wohlberg B (2015) Efficient algorithms for convolutional sparse representations. IEEE Trans Image Process 25(1):301–315
30. Liu Y, Chen X et al (2019) Medical image fusion via convolutional sparsity based morphological component analysis. IEEE Signal Process Lett 26(3):485–489
31. Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54:4311–4322
32. Dong W et al (2013) Sparse representation based image interpolation with nonlocal autoregressive modeling. IEEE Trans Image Process 22(4):1382–1394
33. Wang Z, Cuia Z, Zhu Y (2020) Multi-modal medical image fusion by Laplacian pyramid and adaptive sparse representation. Comput Biol Med 123:103823
34. Li Y, Sun Y, Huang X et al (2018) An image fusion method based on sparse representation and sum modified-Laplacian in NSCT domain. Entropy 20(7):522
35. Jiao D, Li W, Xiao B (2017) Anatomical-functional image fusion by information of interest in local Laplacian filtering domain. IEEE Trans Image Process 12:1–1
36. Diwakar M, Singh P, Shankar A (2021) Multi-modal medical image fusion framework using co-occurrence filter and local extrema in NSST domain. Biomed Signal Process Control 68(12):102788
37. Yuan GA et al (2021) Fusion of medical images based on salient features extraction by PSO optimized fuzzy logic in NSST domain. Biomed Signal Process Control 69:102852
38. Wei T et al (2020) Multimodal medical image fusion algorithm in the era of big data. Neural Comput Appl 3:1–21
39. Cvejic N, Canagarajah C, Bull D (2006) Image fusion metric based on mutual information and Tsallis entropy. Electron Lett 42:626–627
40. Zhang X-L, Li X-F, Li J (2014) Validation and correlation analysis of metrics for evaluating performance of image fusion. Acta Autom Sin 40(2):306–315
41. Qu G, Zhang D, Yan P (2002) Information measure for performance of image fusion. Electron Lett 38:313–315
42. Petrović V (2007) Subjective tests for image fusion evaluation and objective metric validation. Inf Fusion 8:208–216
43. Bovik HA (2006) Image information and visual quality. IEEE Trans Image Process 15(2):430

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.