Image Fusion With Cosparse Analysis Operator
Abstract
The paper addresses the image fusion problem, where multiple images captured with different focus distances are to be combined into a higher quality all-in-focus image. Most current approaches for image fusion rely strongly on the unrealistic assumption of noise-free image acquisition, and therefore yield limited robustness in fusion processing. In our approach, we formulate the multifocus image fusion problem in terms of an analysis sparse model, and simultaneously perform the restoration and fusion of multifocus images. Based on this model, we propose an analysis operator learning approach, and define a novel fusion function to generate an all-in-focus image. Experimental evaluations confirm the effectiveness of the proposed fusion approach both visually and quantitatively, and show that our approach outperforms state-of-the-art fusion methods.
1 Introduction
Fusion of multifocus images is a popular approach for generating an all-in-focus image with fewer artifacts and higher quality [1]. It relies on the idea of combining a captured sequence of multifocus images with different focal settings. It plays a crucial role in many fundamental fields such as machine vision, remote sensing and medical imaging [3]. During the last decade, two types of fusion approaches have been developed: the transform domain-based approaches and the spatial frequency-based approaches. Most existing transform domain-based methods [6] work with a limited basis, and make fusion excessively dependent on the choice of that basis. The latter approaches [10] require highly accurate sub-pixel or sub-region estimates, and thus fail to perform well in eliminating undesirable artifacts.
A prevalent approach for image fusion is based on the synthesis sparse model. Various fusion methods have been proposed to explore this model [3]. The core ideas here are to describe source images as a linear combination of a few columns from a prespecified dictionary, merge the sparse coefficients by a fusion function, and then generate an all-in-focus image using the reconstructed sparse coefficients. While there has been extensive research on the synthesis sparse model, the analysis sparse model [?] is a recent construction that stands as a powerful alternative. This new model represents a signal $\mathbf{x}$ by multiplying it with the so-called analysis operator $\boldsymbol{\Omega}$, and it emphasizes the zero elements of the resulting analyzed vector $\boldsymbol{\Omega}\mathbf{x}$, which describe the subspace containing the signal $\mathbf{x}$. It promotes strong linear dependencies between rows of $\boldsymbol{\Omega}$, leads to a much richer union-of-subspaces, and shows promise to be superior in various applications [23].
In this letter, we develop a novel fusion algorithm based on an analysis sparse model which allows for simultaneous restoration and fusion of multifocus images. Specifically, we formulate the fusion problem as a regularized inverse problem of estimating an all-in-focus image given its reconstructed form, and take advantage of the correlations among multiple captured images for fusion using a cosparsity prior. The corresponding algorithm exploits a combination of variable splitting and the alternating direction method of multipliers (ADMM) to learn the analysis operator $\boldsymbol{\Omega}$ that promotes cosparsity. Furthermore, a fusion function generates the cosparse representation of an all-in-focus image. The advantages of this approach are a more flexible cosparse representation compared to the synthesis sparse approaches, and better image restoration and fusion performance. The proposed approach also widens the applicability of the analysis sparse model.
2 Problem Formulation
Let $\{\mathbf{I}_k\}_{k=1}^{K}$ be a sequence of multifocus images of the same scene acquired with different focal parameters. Our goal is to recover an all-in-focus image $\mathbf{I}_F$ from the captured images $\{\mathbf{I}_k\}_{k=1}^{K}$. The model for describing the relationship between the sequence of multifocus images and the unknown all-in-focus image can be formally expressed as
$$\mathbf{I}_k = \mathcal{H}_k(\mathbf{I}_F) + \mathbf{V}_k, \quad k = 1, \ldots, K,$$
where $\mathcal{H}_k$ is a blurring operator that denotes the physical process of capturing the $k$th multifocus image [26], and $\mathbf{V}_k$ is an additive zero-mean white Gaussian noise matrix, with entries drawn at random from the normal distribution $\mathcal{N}(0, \sigma^2)$. The blurring operator is in most cases unknown and irreversible, so it is impractical to search for the sequence $\{\mathcal{H}_k\}_{k=1}^{K}$ among all possible operators. Instead, it is more favorable to seek a compromise between the physical modelling of image capture and signal approximation.
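The observation model above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the blurring operator is replaced by a hypothetical region-wise box blur (`box_blur` and `capture` are names introduced here, not taken from the paper), and each simulated capture blurs one half of a random test image before zero-mean white Gaussian noise is added.

```python
import numpy as np

def box_blur(img, radius):
    """Separable moving-average blur with edge padding (a crude stand-in for defocus)."""
    if radius == 0:
        return img.astype(float)
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(img.astype(float), radius, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def capture(sharp, blur_mask, radius, sigma, rng):
    """Simulate one multifocus observation: blur the masked region, then add noise."""
    blurred = np.where(blur_mask, box_blur(sharp, radius), sharp)
    return blurred + sigma * rng.standard_normal(sharp.shape)

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))            # assumed all-in-focus test image
left = np.zeros((32, 32), dtype=bool)
left[:, :16] = True                     # region that is out of focus in capture 1

img_1 = capture(sharp, left, radius=2, sigma=0.05, rng=rng)   # left half blurred
img_2 = capture(sharp, ~left, radius=2, sigma=0.05, rng=rng)  # right half blurred
```

A real defocus blur is spatially varying and depth dependent, which is why the text treats the operator as unknown instead of trying to invert it.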
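Assuming the cosparsity prior, each image patch $\mathbf{x}_k \in \mathbb{R}^{n}$ of the image $\mathbf{I}_k$ is said to have a cosparse representation over a known analysis operator $\boldsymbol{\Omega} \in \mathbb{R}^{m \times n}$ with $m \geq n$, if there exists a sparse analyzed vector $\boldsymbol{\Omega}\mathbf{x}_k \in \mathbb{R}^{m}$. The model emphasizes the zero coefficients of $\boldsymbol{\Omega}\mathbf{x}_k$ and defines $\boldsymbol{\Omega}_{\Lambda}$ as the submatrix of $\boldsymbol{\Omega}$ with rows that belong to the cosupport set $\Lambda$. The cosupport set consists of the row indices that determine the subspace that $\mathbf{x}_k$ is orthogonal to. Then, $\mathbf{x}_k$ can be characterized by its cosupport. Every image patch is estimated by solving the following optimization problem
$$\hat{\mathbf{x}}_k = \arg\min_{\mathbf{x}} \|\boldsymbol{\Omega}\mathbf{x}\|_0 \quad \text{s.t.} \quad \|\mathbf{x}_k - \mathbf{x}\|_2 \leq \epsilon,$$
where $\hat{\mathbf{x}}_k$ is a denoised estimate of $\mathbf{x}_k$, $\epsilon$ is a tolerance error, and $\|\cdot\|_0$ and $\|\cdot\|_2$ are respectively the $\ell_0$ norm and the $\ell_2$ norm of a vector. Since the all-in-focus patch $\mathbf{x}_F$ is unknown, we choose to use the cosparse representation vectors $\boldsymbol{\Omega}\hat{\mathbf{x}}_k$ for recovering $\mathbf{x}_F$ through optimal fusion of the cosparse coefficients. Using correlations among multiple images, the proposed approach defines a fusion function $\mathcal{F}(\cdot)$ that generates an optimal cosparse representation and returns the corresponding indices of image patches. Thus, a natural generalization of the problem for recovering a clean all-in-focus image patch is obtained by applying the same constrained formulation to the fused cosparse representation produced by $\mathcal{F}(\cdot)$.
The role of $\epsilon$ in this generalized problem is to provide a meaningful constraint on how closely the optimal patch approximates $\mathbf{x}_F$. We replace the cosparse representation of $\mathbf{x}_F$ by the corresponding optimal representation of the input images, with respect to the cosparse analysis operator $\boldsymbol{\Omega}$. Therefore, the major subproblems here are learning the analysis operator $\boldsymbol{\Omega}$ and defining the fusion function $\mathcal{F}(\cdot)$.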
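To make the notion of a cosupport concrete, the following minimal sketch computes it for a toy signal. The analysis operator here is an assumed circular finite-difference matrix chosen only for illustration (it is not a learned operator), and the function name `cosupport` is hypothetical.

```python
import numpy as np

def cosupport(omega, x, tol=1e-10):
    """Indices of the rows of omega that are (numerically) orthogonal to x."""
    return np.flatnonzero(np.abs(omega @ x) <= tol)

n = 5
# Assumed toy operator: row i computes x[(i + 1) % n] - x[i].
omega = np.roll(np.eye(n), 1, axis=1) - np.eye(n)

x = np.array([1.0, 1.0, 1.0, 4.0, 4.0])   # piecewise-constant patch
lam = cosupport(omega, x)                  # rows that annihilate x
cosparsity = lam.size                      # number of zeros in omega @ x
omega_lam = omega[lam]                     # submatrix indexed by the cosupport
```

For this piecewise-constant patch only the two rows that straddle a jump give nonzero coefficients, so the patch lies in the null space of the remaining three rows.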
3 Robust Fusion via Analysis Sparse Model
Analysis operator learning aims at constructing an operator suitable for a family of signals of interest. We first propose a practical approach for learning the analysis operator by variable splitting and ADMM. Then, with the analysis operator fixed, we define the optimal fusion function.
3.1 Analysis Operator Learning
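Suppose a training set $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_N]$ is formed from a set of clean vectorized images $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_N]$ contaminated by additive zero-mean white Gaussian noise $\mathbf{V}$. Our task is to find $\boldsymbol{\Omega}$ which enforces the coefficient vector $\boldsymbol{\Omega}\mathbf{x}_i$ to be sparse for each $\mathbf{x}_i$. This problem can be cast as
$$\min_{\boldsymbol{\Omega}, \mathbf{X}} \; \|\boldsymbol{\Omega}\mathbf{X}\|_1 + \frac{\lambda}{2}\|\mathbf{Y} - \mathbf{X}\|_F^2 \quad \text{s.t.} \quad \boldsymbol{\Omega} \in \mathcal{C},$$
where $\lambda$ is a regularization parameter, $\|\cdot\|_1$ sums the absolute values of the entries, and $\mathcal{C}$ is a constraint set. To prevent $\boldsymbol{\Omega}$ from being degenerate, it is common to constrain its rows to have unit norm. The constraint set $\mathcal{C}$ then collects the operators with unit-norm rows that satisfy $\boldsymbol{\Omega}_{\Lambda_i}\mathbf{x}_i = \mathbf{0}$ together with a rank condition on $\boldsymbol{\Omega}_{\Lambda_i}$,
where $\Lambda_i$ is the index set of the rows in $\boldsymbol{\Omega}$ corresponding to zero elements in $\boldsymbol{\Omega}\mathbf{x}_i$, $\operatorname{rank}(\cdot)$ denotes the rank of a matrix, and the elements of $\mathbf{0}$ are zeros.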
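The unit-norm row constraint is easy to enforce numerically. As a small sketch (the helper `normalize_rows` is a name assumed here, and rescaling after each update is one common way to keep an operator inside such a constraint set, not necessarily the authors' procedure):

```python
import numpy as np

def normalize_rows(omega, eps=1e-12):
    """Rescale every row of omega to unit Euclidean norm (zero rows are left as zeros)."""
    norms = np.linalg.norm(omega, axis=1, keepdims=True)
    return omega / np.maximum(norms, eps)

rng = np.random.default_rng(0)
omega = normalize_rows(rng.standard_normal((12, 8)))   # random initial operator, m=12, n=8
row_norms = np.linalg.norm(omega, axis=1)
```

Without such a constraint the sparsity term could be driven to zero simply by shrinking the operator, which is the degenerate solution the text warns about.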
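The operator learning problem is nonconvex with respect to the variables $(\boldsymbol{\Omega}, \mathbf{X})$. A fundamental approach to addressing it is to alternate between the two sets of variables $\boldsymbol{\Omega}$ and $\mathbf{X}$, i.e., minimizing over one while keeping the other fixed. Motivated by the first-order surrogate (FOS) approach [28] and ADMM [29], we propose the FOS-ADMM algorithm for cosparse coding.
With $\boldsymbol{\Omega}$ fixed, we update each column of $\mathbf{X}$. For notational simplicity, we drop the column index in $\mathbf{x}$ and $\mathbf{y}$. We observe that the objective function for fixed $\boldsymbol{\Omega}$, i.e., $f(\mathbf{x}) = \frac{\lambda}{2}\|\mathbf{y} - \mathbf{x}\|_2^2 + \|\boldsymbol{\Omega}\mathbf{x}\|_1$, is the sum of two functions: $g(\mathbf{x}) = \frac{\lambda}{2}\|\mathbf{y} - \mathbf{x}\|_2^2$ and $h(\mathbf{z}) = \|\mathbf{z}\|_1$, where $\mathbf{z} = \boldsymbol{\Omega}\mathbf{x}$. This enables us to transform the problem of minimizing $f(\mathbf{x})$ into the following constrained optimization problem
$$\min_{\mathbf{x}, \mathbf{z}} \; g(\mathbf{x}) + h(\mathbf{z}) \quad \text{s.t.} \quad \mathbf{z} = \boldsymbol{\Omega}\mathbf{x}.$$
The iterative algorithm to solve this problem can then be expressed in the following ADMM [29] form
$$\begin{aligned}
\mathbf{x}^{t+1} &= \arg\min_{\mathbf{x}} \; g(\mathbf{x}) + \frac{\rho}{2}\left\|\boldsymbol{\Omega}\mathbf{x} - \mathbf{z}^{t} + \boldsymbol{\mu}^{t}/\rho\right\|_2^2, && \text{(a)} \\
\mathbf{z}^{t+1} &= \arg\min_{\mathbf{z}} \; h(\mathbf{z}) + \frac{\rho}{2}\left\|\boldsymbol{\Omega}\mathbf{x}^{t+1} - \mathbf{z} + \boldsymbol{\mu}^{t}/\rho\right\|_2^2, && \text{(b)} \\
\boldsymbol{\mu}^{t+1} &= \boldsymbol{\mu}^{t} + \rho\left(\boldsymbol{\Omega}\mathbf{x}^{t+1} - \mathbf{z}^{t+1}\right),
\end{aligned}$$
where $\rho$ is the augmented Lagrangian penalty and $\boldsymbol{\mu}^{t}$ is the vector of Lagrange multipliers at the $t$th iteration. Note that the updates for $\mathbf{x}$ and $\mathbf{z}$ are separated into the subproblems (a) and (b). The subproblem (a) is a convex quadratic problem, and it can be easily solved by the FOS approach [28], which consists of iteratively solving the optimization problem
$$\mathbf{x}^{s+1} = \arg\min_{\mathbf{x}} \; \nabla q(\mathbf{x}^{s})^{T}(\mathbf{x} - \mathbf{x}^{s}) + \frac{L}{2}\|\mathbf{x} - \mathbf{x}^{s}\|_2^2,$$
where $q(\mathbf{x}) = g(\mathbf{x}) + \frac{\rho}{2}\|\boldsymbol{\Omega}\mathbf{x} - \mathbf{z}^{t} + \boldsymbol{\mu}^{t}/\rho\|_2^2$, $\nabla q$ is the gradient of $q$, $L$ is the Lipschitz constant of $\nabla q$, $s$ indexes the inner iterations, and $(\cdot)^{T}$ stands for the transpose.
As for the subproblem (b) of updating $\mathbf{z}$, it turns out to be a simple shrinkage problem. Thus, we just employ the soft-thresholding operator $\mathcal{S}_{1/\rho}(\cdot)$, and the update becomes
$$\mathbf{z}^{t+1} = \mathcal{S}_{1/\rho}(\mathbf{v}) = \operatorname{sgn}(\mathbf{v}) \odot \max\left(|\mathbf{v}| - 1/\rho,\, 0\right), \quad \mathbf{v} = \boldsymbol{\Omega}\mathbf{x}^{t+1} + \boldsymbol{\mu}^{t}/\rho,$$
where $\operatorname{sgn}(\cdot)$ is the sign function and $\odot$ stands for the componentwise product.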
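The cosparse coding step can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes the per-column objective $\frac{\lambda}{2}\|\mathbf{y}-\mathbf{x}\|_2^2 + \|\boldsymbol{\Omega}\mathbf{x}\|_1$, solves the quadratic subproblem with a fixed number of first-order surrogate (gradient) steps of size $1/L$, and uses soft-thresholding for the shrinkage subproblem. The function names, the toy operator and all parameter values are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Componentwise shrinkage: sgn(v) * max(|v| - tau, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(omega, x, y, lam):
    """Assumed cosparse coding objective for a single training column."""
    return 0.5 * lam * np.sum((y - x) ** 2) + np.sum(np.abs(omega @ x))

def fos_admm(omega, y, lam=1.0, rho=1.0, outer=300, inner=20):
    """ADMM with split z = omega @ x; the x-update uses first-order surrogate steps."""
    x = y.copy()
    z = omega @ x
    mu = np.zeros(omega.shape[0])
    # Lipschitz constant of the gradient of the quadratic x-subproblem.
    lipschitz = lam + rho * np.linalg.norm(omega, 2) ** 2
    for _ in range(outer):
        for _ in range(inner):
            grad = lam * (x - y) + rho * omega.T @ (omega @ x - z + mu / rho)
            x = x - grad / lipschitz          # minimizer of the surrogate at x
        z = soft_threshold(omega @ x + mu / rho, 1.0 / rho)
        mu = mu + rho * (omega @ x - z)       # multiplier update
    return x, z

rng = np.random.default_rng(0)
n = 64
clean = np.repeat([0.0, 1.0, -0.5, 2.0], 16)              # piecewise-constant signal
y = clean + 0.1 * rng.standard_normal(n)                   # noisy observation
omega = (np.roll(np.eye(n), 1, axis=1) - np.eye(n)) / np.sqrt(2)  # unit-norm rows

x_hat, z_hat = fos_admm(omega, y)
```

Each inner step is the closed-form minimizer of the quadratic surrogate, so no linear system has to be solved, which is the practical appeal of combining FOS with ADMM.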
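With $\mathbf{X}$ fixed, we turn to updating $\boldsymbol{\Omega}$, which amounts to obtaining each row $\boldsymbol{\omega}_j$ of $\boldsymbol{\Omega}$. The update of $\boldsymbol{\omega}_j$ should be affected only by those columns of $\mathbf{X}$ that are orthogonal to it [23]. Denote by $J$ the indices of those columns; then the corresponding optimization problem can be written as
$$\min_{\boldsymbol{\omega}_j, \mathbf{X}_J} \; \|\mathbf{X}_J - \mathbf{Y}_J\|_F^2 \quad \text{s.t.} \quad \boldsymbol{\Omega}_{\Lambda_i}\mathbf{x}_i = \mathbf{0} \;\; \forall i \in J, \quad \|\boldsymbol{\omega}_j\|_2 = 1,$$
where $\mathbf{X}_J$ and $\mathbf{Y}_J$ are the submatrices of $\mathbf{X}$ and $\mathbf{Y}$ containing the columns found to be orthogonal to $\boldsymbol{\omega}_j$, respectively.
This problem leads to updates of the cosupport sets in each iteration. Motivated by [23], we simplify the updates and use the following approximation
$$\hat{\boldsymbol{\omega}}_j = \arg\min_{\boldsymbol{\omega}_j} \; \|\boldsymbol{\omega}_j^{T}\mathbf{Y}_J\|_2^2 \quad \text{s.t.} \quad \|\boldsymbol{\omega}_j\|_2 = 1$$
as an alternative, which can be solved using the singular value decomposition (SVD) of the submatrix of $\mathbf{Y}$ formed by the columns in $J$.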
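The SVD-based row update can be sketched as follows, assuming the approximate problem is to find the unit-norm vector that is least correlated with the selected training columns. Under that assumption the solution is the left singular vector associated with the smallest singular value. The function name `update_row` and the synthetic data are illustrative only.

```python
import numpy as np

def update_row(y_sub):
    """Unit vector w minimizing ||w^T y_sub||_2: last left singular vector of y_sub."""
    # full_matrices=True guarantees an n x n basis even when y_sub has few columns.
    u, _, _ = np.linalg.svd(y_sub, full_matrices=True)
    return u[:, -1]

rng = np.random.default_rng(0)
n = 6
w_true = rng.standard_normal(n)
w_true /= np.linalg.norm(w_true)

# Synthetic columns that are (almost) orthogonal to w_true.
samples = rng.standard_normal((n, 40))
y_sub = samples - np.outer(w_true, w_true @ samples)
y_sub += 1e-3 * rng.standard_normal(y_sub.shape)

w_hat = update_row(y_sub)   # recovers w_true up to sign
```

The sign ambiguity is harmless because only the zero pattern of the analyzed vector matters for the cosupport.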
3.2 Local Optimal Fusion
Once the analysis operator $\boldsymbol{\Omega}$ is learned, we still cannot directly compute the cosparse representation of the unknown all-in-focus patch. Instead, we work with the collection of the cosparse representations of the $K$ source patches, and then seek the optimal one to recover the corresponding all-in-focus patch.
Using the sliding window technique, each image can be divided into small patches, from top-left to bottom-right. For convenience, we introduce a matrix $\mathbf{R}_i$ to extract the $i$th block from an image. Since visible artifacts may occur on block boundaries, we also introduce an overlap of fixed length between neighbouring patches, and demand that the reconstructed all-in-focus patches agree with each other on the overlapping areas. According to [16], the block, or equivalently, the set of indices corresponding to the biggest value in the set $\{\|\boldsymbol{\Omega}\hat{\mathbf{x}}_k\|_1\}_{k=1}^{K}$ is chosen to reconstruct the fused image. Thus, the problem of finding the optimal cosparse representation can be formulated as
$$k^{\star} = \arg\max_{k \in \{1, \ldots, K\}} \; \|\boldsymbol{\Omega}\hat{\mathbf{x}}_k\|_1.$$
Given the optimal fusion function, the fusion problem can be cast as the basis pursuit problem with the cosparse regularization term $\|\boldsymbol{\Omega}\mathbf{x}\|_1$. Thus, the problem can be replaced by its convex counterpart, in which the sparsity penalty is the term $\|\boldsymbol{\Omega}\mathbf{x}\|_1$ and whose solution gives the initial estimate of the fused patch.
Since this problem is convex, its solution can be efficiently computed by using many existing algorithms [20].
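A minimal sketch of the patch selection rule follows, assuming that the activity measure is the $\ell_1$ norm of the analyzed vector and that the source with the largest value wins at each patch position. The array layout, the function name `fuse_by_max_l1` and the toy finite-difference operator are assumptions made for illustration.

```python
import numpy as np

def fuse_by_max_l1(omega, patches):
    """Select, per patch position, the source with the largest l1 analysis activity.

    patches has shape (K, n, P): K source images, patch length n, P positions.
    Returns the fused patches (n, P) and the winning source index per position (P,).
    """
    coeffs = np.einsum("mn,knp->kmp", omega, patches)   # analyzed vectors, (K, m, P)
    activity = np.abs(coeffs).sum(axis=1)                # l1 norms, (K, P)
    best = activity.argmax(axis=0)                       # winning source, (P,)
    cols = np.arange(patches.shape[2])
    return patches[best, :, cols].T, best

n = 8
omega = np.roll(np.eye(n), 1, axis=1) - np.eye(n)       # assumed toy operator
sharp = np.array([0.0, 1.0] * 4)                         # high-contrast texture
blurred = np.array([0.4, 0.6] * 4)                       # same texture, defocused

patches = np.stack([
    np.stack([sharp, blurred], axis=1),   # source 0 is in focus at position 0
    np.stack([blurred, sharp], axis=1),   # source 1 is in focus at position 1
])
fused, best = fuse_by_max_l1(omega, patches)
```

Blur attenuates high-frequency content, so the in-focus patch tends to produce the larger analysis coefficients, which is the intuition behind a max-activity rule.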
3.3 Global Reconstruction
The local optimal fusion explained above is used to recover local details for each all-in-focus patch, respecting spatial compatibility between neighbouring patches. In order to remove possible artifacts and improve spatial smoothness, a global reconstruction constraint between the initial image estimate $\hat{\mathbf{I}}_0$, formed from all the fused patches, and the final estimate $\hat{\mathbf{I}}_F$ can be applied to make a further improvement.
The size of $\boldsymbol{\Omega}$ is suitable for representing a small image patch, but it is too small to apply to the entire image. Therefore, we expand $\boldsymbol{\Omega}$ and define a global analysis operator $\boldsymbol{\Omega}_G$ that applies $\boldsymbol{\Omega}$ to the patches of the entire image, with the patch positions determined by the indices of the patch boundaries. Using the result from the local optimal fusion, the entire image can be re-estimated under the reconstruction constraint by solving the problem
$$\hat{\mathbf{I}}_F = \arg\min_{\mathbf{I}} \; \|\mathbf{I} - \hat{\mathbf{I}}_0\|_F^2 + \gamma\,\|\boldsymbol{\Omega}_G \operatorname{vec}(\mathbf{I})\|_1,$$
where $\gamma$ is the parameter controlling the trade-off between the sparsity penalty and the representation fidelity. The entire process of the optimal fusion is summarized in Algorithm ?.
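The patch handling that underlies both the local fusion and the initial image estimate can be sketched as below. This is an illustration under assumptions: patches are extracted on a regular grid from top-left to bottom-right, and the initial estimate is formed by averaging overlapping patches, which is one common aggregation rule and is not spelled out in the text. The function names are hypothetical.

```python
import numpy as np

def extract_patches(img, size, step):
    """Slide a size x size window over img; return vectorized patches and their corners."""
    height, width = img.shape
    coords = [(r, c)
              for r in range(0, height - size + 1, step)
              for c in range(0, width - size + 1, step)]
    patches = np.stack([img[r:r + size, c:c + size].ravel() for r, c in coords], axis=1)
    return patches, coords

def aggregate_patches(patches, coords, shape, size):
    """Put patches back in place and average wherever they overlap."""
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    for patch, (r, c) in zip(patches.T, coords):
        acc[r:r + size, c:c + size] += patch.reshape(size, size)
        weight[r:r + size, c:c + size] += 1.0
    return acc / np.maximum(weight, 1.0)

rng = np.random.default_rng(0)
img = rng.random((8, 8))
patches, coords = extract_patches(img, size=4, step=2)     # overlap of 2 pixels
rebuilt = aggregate_patches(patches, coords, img.shape, size=4)
```

With unmodified patches the round trip reproduces the image exactly; after fusion the averaging makes neighbouring patches agree on their overlaps before any global refinement is applied.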
4 Experimental Results
We verify the restoration and fusion performance of the proposed approach by visual comparisons, and then discuss the quantitative assessments. We have tested our approach on a number of images, and one representative example is shown here. Specifically, fusion experiments over the standard multifocus dataset [31] are conducted. Throughout all the experiments, the tolerance error, the maximum number of iterations, the patch size, and the overlapping length of the proposed approach are kept fixed. During the analysis operator learning, the generated training set consists of two-dimensional normalized samples extracted at random from natural images. Considering the trade-off between fusion quality and computational cost, the analysis operator size is also fixed. All the experiments are performed on a PC running an Intel(R) Xeon(R) 3.40GHz CPU.
In the noise-free and noisy cases, the proposed approach is compared with well-known fusion approaches, including the image fusion approach based on spatial frequency in the discrete cosine transform domain (SFDCT) [5] and the sparse representation K-SVD-based image fusion approach (SRKSVD) [16]. The fusion results for the noise-free “Dog” images are shown in Figure 5, including the magnified details in the lower right corners of the images. There are noticeable differences at the edge of the wall. The SFDCT method (see Figure 5(c)) produces blocking artifacts, and the SRKSVD method (see Figure 5(d)) introduces undesired smoothing. Our proposed method (see Figure 5(e)) eliminates some of these artificial distortions, and gives a better visual result. To test the robustness of our approach, we add Gaussian noise to the multifocus images. The results of the tested approaches on the noisy images are shown in Figure 10. Note that the SFDCT method needs denoising as a preprocessing step before it fuses the multifocus images. Figure 10(c) shows a circular blurring effect around strong boundaries. The image in Figure 10(d) also shows a blurring effect for the SRKSVD method. Our approach is capable of providing restoration and fusion simultaneously, and it visually performs the best, as can be seen in Figure 10(e).
More objectively, we test the impact of different parameter selections on the proposed approach. The objective evaluation is based on the following two state-of-the-art fusion performance metrics: the metric of [32], which measures how well the mutual information from the source images is preserved in the fused image; and the metric of [33], which evaluates how well the edge information transfers from the source images to the fused image. Both metrics are bounded, with the largest value representing the ideal fusion. First, we conduct several experiments for different patch sizes, and compare the performance in the noise-free and noisy cases for the aforementioned methods in Figure 12. In either case, the values of the mutual information metric (see Figure 12(a)) and of the edge information metric (see Figure 12(b)) for the proposed approach are always larger than those for the SFDCT and SRKSVD methods. This means that our approach preserves the mutual information well and transfers the edge information efficiently from the source images. We then fix the patch size at the value for which both metrics are optimal, and conduct fusion experiments with different noise levels. The results are shown in Figure 14. It can be seen that all the methods tested show larger values when the noise level is equal to zero. With the increase of the noise level, the values of both metrics gradually decrease, while the proposed method performs the best. Table 1 presents the average running time of the aforementioned methods. As expected, the proposed approach achieves high-quality restoration and fusion in reasonable time.


Table 1: Average running time of the compared methods.
SFDCT | SRKSVD | Ours | SFDCT | SRKSVD | Ours
1.073 | 4.342 | 4.567 | 2.179 | 5.281 | 5.981
5 Conclusion
A novel fusion approach for combining noisy multifocus images into a higher quality all-in-focus image based on the analysis sparse model has been presented. Using the cosparsity prior assumption, we have proposed an analysis operator learning approach based on ADMM. Furthermore, an efficient fusion process via the learned analysis operator has been presented. Extensive experiments have demonstrated that the proposed approach can fuse images with remarkably high quality, and have confirmed the highly competitive performance of our proposed algorithm. As future work, a more flexible penalty function can be employed in the fusion problem, which can possibly lead to even better results.
References
[1] A. A. Goshtasby and S. Nikolov, “Image fusion: advances in the state of the art,” Inf. Fusion, vol. 8, no. 2, pp. 114–118, Apr. 2007.
[2] T. Wan, C. Zhu, and Z. Qin, “Multifocus image fusion based on robust principal component analysis,” Pattern Recognit. Lett., vol. 34, no. 9, pp. 1001–1008, Jul. 2013.
[3] Q. Zhang and M. D. Levine, “Robust multi-focus image fusion using multi-task sparse representation and spatial context,” IEEE Trans. Image Process., vol. 25, no. 5, pp. 2045–2058, May 2016.
[4] V. N. Gangapure, S. Banerjee, and A. S. Chowdhury, “Steerable local frequency based multispectral multifocus image fusion,” Inf. Fusion, vol. 23, pp. 99–115, May 2015.
[5] L. Cao, L. Jin, H. Tao, G. Li, Z. Zhuang, and Y. Zhang, “Multi-focus image fusion based on spatial frequency in discrete cosine transform domain,” IEEE Signal Process. Lett., vol. 22, no. 2, pp. 220–224, Feb. 2015.
[6] P. Burt and E. Adelson, “The Laplacian pyramid as a compact image code,” IEEE Trans. Commun., vol. 31, no. 4, pp. 532–540, Apr. 1983.
[7] V. Aslantas and R. Kurban, “Fusion of multi-focus images using differential evolution algorithm,” Expert Syst. Appl., vol. 37, no. 12, pp. 8861–8870, Dec. 2010.
[8] S. Li and B. Yang, “Multifocus image fusion using region segmentation and spatial frequency,” Image Vision Comput., vol. 26, no. 7, pp. 971–979, Jul. 2008.
[9] Z. Zhou, S. Li, and B. Wang, “Multi-scale weighted gradient-based fusion for multi-focus images,” Inf. Fusion, vol. 20, pp. 60–72, Nov. 2014.
[10] J. Tian and L. Chen, “Adaptive multi-focus image fusion using a wavelet-based statistical sharpness measure,” Signal Process., vol. 92, no. 9, pp. 2137–2146, Sep. 2012.
[11] A. L. Da Cunha, J. Zhou, and M. N. Do, “The nonsubsampled contourlet transform: theory, design, and applications,” IEEE Trans. Image Process., vol. 15, no. 10, pp. 3089–3101, Oct. 2006.
[12] O. Rockinger, “Image sequence fusion using a shift-invariant wavelet transform,” in Proc. IEEE Int. Conf. Image Process., Santa Barbara, CA, 1997, vol. 3, pp. 288–291.
[13] Q. Zhang and B. Guo, “Multifocus image fusion using the nonsubsampled contourlet transform,” Signal Process., vol. 89, no. 7, pp. 1334–1346, Jul. 2009.
[14] S. Ambat, S. Chatterjee, and K. Hari, “Fusion of algorithms for compressed sensing,” IEEE Trans. Signal Process., vol. 61, no. 14, pp. 3699–3704, May 2010.
[15] H. Li, L. Li, and J. Zhang, “Multi-focus image fusion based on sparse feature matrix decomposition and morphological filtering,” Opt. Commun., vol. 342, pp. 1–11, May 2015.
[16] B. Yang and S. Li, “Multifocus image fusion and restoration with sparse representation,” IEEE Trans. Instrum. Meas., vol. 59, no. 4, pp. 884–892, Apr. 2010.
[17] M. Nejati, S. Samavi, and S. Shirani, “Multi-focus image fusion using dictionary-based sparse representation,” Inf. Fusion, vol. 25, pp. 72–84, Sep. 2015.
[18] R. Gao, S. A. Vorobyov, and H. Zhao, “Multi-focus image fusion via coupled dictionary training,” in Proc. IEEE 41st Int. Conf. Acoustics, Speech and Signal Process., Shanghai, China, 2016, pp. 1666–1670.
[19] M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” Inv. Probl., vol. 23, no. 3, pp. 947–968, Jun. 2007.
[20] J. Dong, W. Wang, W. Dai, M. D. Plumbley, Z. Han, and J. Chambers, “Analysis SimCO algorithms for sparse analysis model based dictionary learning,” IEEE Trans. Signal Process., vol. 64, no. 2, pp. 417–431, Jan. 2016.
[21] M. Seibert, J. Wörmann, R. Gribonval, and M. Kleinsteuber, “Learning co-sparse analysis operators with separable structures,” IEEE Trans. Signal Process., vol. 64, no. 1, pp. 120–130, Jan. 2016.
[22] S. Nam, M. E. Davies, M. Elad, and R. Gribonval, “The cosparse analysis model and algorithms,” Appl. Comput. Harmon. Anal., vol. 34, no. 1, pp. 30–56, Jan. 2013.
[23] R. Rubinstein, T. Peleg, and M. Elad, “Analysis K-SVD: A dictionary-learning algorithm for the analysis sparse model,” IEEE Trans. Signal Process., vol. 61, no. 3, pp. 661–677, Feb. 2013.
[24] M. Yaghoobi, S. Nam, R. Gribonval, and M. E. Davies, “Constrained overcomplete analysis operator learning for cosparse signal modelling,” IEEE Trans. Signal Process., vol. 61, no. 9, pp. 2341–2355, May 2013.
[25] S. Hawe, M. Kleinsteuber, and K. Diepold, “Analysis operator learning and its application to image reconstruction,” IEEE Trans. Image Process., vol. 22, no. 6, pp. 2138–2150, Feb. 2013.
[26] S. Pertuz, D. Puig, M. A. Garcia, and A. Fusiello, “Generation of all-in-focus images by noise-robust selective fusion of limited depth-of-field images,” IEEE Trans. Image Process., vol. 22, no. 3, pp. 1242–1251, Mar. 2013.
[27] M. Subbarao, T. Choi, and A. Nikzad, “Focusing techniques,” Opt. Eng., vol. 32, pp. 2824–2836, Mar. 1993.
[28] J. Mairal, “Incremental majorization-minimization optimization with application to large-scale machine learning,” SIAM J. Optim., vol. 25, no. 2, pp. 829–855, Apr. 2015.
[29] J. Eckstein and D. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Math. Program., vol. 55, no. 3, pp. 293–318, Nov. 1992.
[30] S. Xie and S. Rahardja, “Alternating direction method for balanced image restoration,” IEEE Trans. Image Process., vol. 21, no. 11, pp. 4557–4567, Nov. 2012.
[31] M. Nejati, S. Samavi, and S. Shirani, “Multi-focus image fusion using dictionary-based sparse representation,” Inf. Fusion, vol. 25, pp. 72–84, Sep. 2015.
[32] M. Hossny, S. Nahavandi, and D. Creighton, “Comments on ‘Information measure for performance of image fusion’,” Electron. Lett., vol. 44, no. 18, pp. 1066–1067, Aug. 2008.
[33] C. Xydeas and V. Petrović, “Objective image fusion performance measure,” Electron. Lett., vol. 36, no. 4, pp. 308–309, Feb. 2000.