Robust fusion of multiband images with different spatial and spectral resolutions for change detection
Abstract
Archetypal scenarios for change detection generally consider two images acquired through sensors of the same modality. However, in some specific cases such as emergency situations, the only images available may be those acquired through different kinds of sensors. More precisely, this paper addresses the problem of detecting changes between two multiband optical images characterized by different spatial and spectral resolutions. This sensor dissimilarity introduces additional issues in the context of operational change detection. To alleviate these issues, classical change detection methods are applied after independent preprocessing steps (e.g., resampling) used to get the same spatial and spectral resolutions for the pair of observed images. Nevertheless, these preprocessing steps tend to throw away relevant information. Conversely, in this paper, we propose a method that more effectively uses the available information by modeling the two observed images as spatially and spectrally degraded versions of two (unobserved) latent images characterized by the same high spatial and high spectral resolutions. As they cover the same scene, these latent images are expected to be globally similar except for possible changes in sparse spatial locations. Thus, the change detection task is envisioned through a robust multiband image fusion method which enforces the differences between the estimated latent images to be spatially sparse. This robust fusion problem is formulated as an inverse problem which is iteratively solved using an efficient block-coordinate descent algorithm. The proposed method is applied to real panchromatic/multispectral and hyperspectral images with simulated realistic changes. A comparison with state-of-the-art change detection methods evidences the accuracy of the proposed strategy.
I. Introduction
Remote sensing is a reliable technique for Earth surface monitoring and observation [2, 3]. One of the most important applications using remotely sensed data is the so-called change detection (CD) problem. CD has many definitions and it is generally considered as the ability of analyzing two or more multi-date (i.e., acquired at different time instants) and possibly multi-source (i.e., acquired by different sensors) images of the same scene to detect areas where potential changes have occurred [4, 5]. Because of the increasing number of satellites and of new policies for data distribution, more and more multi-temporal data becomes available. While this increases the amount of information on the observed scene, it also raises additional issues when designing operational change detection techniques.
Each remotely sensed observation image is intimately connected to the acquisition modality, providing a particular excerpt of the observed scene according to the sensor specifications. For instance, optical images are generally well suited to map horizontal structures, e.g., land-cover type at large scales [6]. More particularly, remote sensing images acquired by multiband optical sensors can be classified according to their spectral and spatial resolutions. The spectral resolution is related to the capability of sensing the electromagnetic spectrum. This term can also refer to the number of spectral bands [7, 3], which generally leads to a commonly adopted classification of these images: panchromatic (PAN) images, characterized by a low spectral resolution, and multispectral (MS) and hyperspectral (HS) images, which sense part of the spectrum with higher precision. Alternatively, multiband optical images can be classified with respect to (w.r.t.) their spatial resolution [6, 3]. The concept of spatial resolution should be understood as the capability of representing the smallest object that can be resolved, up to a specific pixel size. Images with a small pixel size and finer details are generally identified as high resolution (HR), in contrast to low resolution (LR) images where only coarse features are observable. Because of the physical limitations of optical passive sensors, multiband optical images suffer from a trade-off between spectral and spatial resolution [8, 2]. To ensure that any sensor receives a sufficient amount of energy to guarantee a proper acquisition (in terms of, e.g., signal-to-noise ratio), one of the resolutions must be decreased, allowing the other to be increased. For this reason, PAN images are generally characterized by a higher spatial resolution and a lower spectral resolution than MS or HS images.
Optical images have been the most studied remote sensing modality for CD since the widely admitted additive Gaussian modeling of optical sensor noise allows CD techniques to be implemented through a simple operation of image differencing [4, 5]. Originally designed for single-band images, CD differencing methods have been adapted to handle multiband images by considering spectral change vectors [9, 10] and transform analysis [11, 12]. The possibility of detecting changes by exploiting both spatial and spectral information is one of the greatest advantages of these multiband images. Nevertheless, images of the same modality are not always available. In some specific scenarios, for instance in the aftermath of natural disasters, the only available images may have been acquired through different kinds of sensors. Such disadvantageous emergency situations nonetheless require fast, flexible and accurate methods, able to also handle the incompatibilities introduced by each sensor modality [13, 14, 15, 16]. Most classical CD methods do not support differences in resolutions. Generally, each observed image is independently preprocessed in order to get the same resolution, and then classical CD techniques are applied. However, independent resampling operations do not take the pair of observed images into account jointly and may even throw away important information. Recently, a general CD framework has been proposed in [17] to deal with multiband images with different spatial and spectral resolutions, based on a three-step procedure (fusion, prediction, detection). Instead of independently preprocessing each observed image, this approach consists in recovering a latent (i.e., unobserved) HR-HS image containing changed and unchanged regions by fusing both observed images. Then, it predicts pseudo-observed images by artificially degrading the estimated HR-HS latent image using the same forward models underlying the actually observed images.
As the pairs of predicted and observed images have the same spatial and spectral resolutions, any classical multiband CD method can finally be applied to build a change map. Albeit significantly improving detection performance when compared to crude methods relying on independent preprocessing, this sequential formulation appears to be non-optimal for two reasons: i) any inaccuracies in the fusion step are propagated throughout the subsequent degradation and detection steps, and ii) relevant information regarding the changes may be lost during the prediction step, since it consists in spatially or spectrally degrading the latent image to estimate the pseudo-observed images. Thus, significant improvements in terms of change detection performance may be expected provided one is able to overcome both limitations.
In this paper, capitalizing on the general framework developed in [17], we show that the CD task can be formulated as a particular instance of the multiband image fusion problem. However, contrary to the sequential procedure in [17], the proposed approach jointly estimates a pair of distinct HR-HS latent images corresponding to the two acquisition times as well as the change image. Since the two HR-HS latent images are supposed to represent the same scene, they are expected to share a high level of similarity or, equivalently, to differ only in a few spatial locations. Thus, akin to numerous robust factorizing models such as robust principal component analysis [18] and robust nonnegative matrix factorization [19], the two observed images are jointly approximated by a standard linear decomposition model complemented with an HR-HS outlier term corresponding to the change image. This so-called CD-driven robust fusion of multiband images is formulated as an inverse problem where, in particular, the outlier term is characterized by a spatial sparsity-inducing regularization. The resulting objective function is minimized through a block coordinate descent (BCD) algorithm, which iteratively optimizes w.r.t. one latent image and the change image. Remarkably, optimizing w.r.t. the latent image boils down to a classical multiband image fusion step and can be efficiently conducted following the algorithmic solutions proposed in [20]. The CD map can finally be generated from the recovered HR-HS change image.
II. From change detection to robust fusion
II-A Generic forward model
Let us consider the image formation process as a sequence of transformations, denoted $\mathcal{T}[\cdot]$, of the original scene into an output image. The output image of a particular sensor is referred to as the observed image and denoted $\mathbf{Y} \in \mathbb{R}^{n_\lambda \times n}$, where $n$ and $n_\lambda$ are the numbers of pixels and spectral bands in the observed image, respectively. It provides a limited version of the original scene with characteristics imposed by the image signal processor (ISP) characterizing the sensor. The original scene can be conveniently represented by an (unknown) latent image of higher spatial and spectral resolutions, $\mathbf{X} \in \mathbb{R}^{m_\lambda \times m}$, where $m \geq n$ and $m_\lambda \geq n_\lambda$ are the numbers of pixels and spectral bands, respectively, related to the observed image following
$\mathbf{Y} = \mathcal{T}\left[\mathbf{X}\right].$ (1)
The intrinsic sequence of transformations applied by the sensor to the latent image can typically be classified as spectral or spatial degradations. On the one hand, spatial degradations are related to the spatial characteristics of the sensor, such as the sampling scheme and the optical transfer function. On the other hand, spectral degradations refer to the wavelength sensitivity and the spectral sampling. There are many ways to represent the degradation process. In this paper, it is considered as a sequence of linear operations leading to the following generic forward model [21, 22, 23]
$\mathbf{Y} = \mathbf{L}\mathbf{X}\mathbf{R} + \mathbf{N}$ (2)
where
$\mathbf{L} \in \mathbb{R}^{n_\lambda \times m_\lambda}$ is the spectral degradation matrix,
$\mathbf{R} \in \mathbb{R}^{m \times n}$ is the spatial degradation matrix,
$\mathbf{N} \in \mathbb{R}^{n_\lambda \times n}$ is the additive term comprising sensor noise and modeling errors.
In (2), the left-multiplying matrix $\mathbf{L}$ degrades the latent image by combining some spectral bands for each pixel, while the right-multiplying matrix $\mathbf{R}$ degrades the latent image by linearly combining pixels within the same spectral band. The former degradation corresponds to a spectral resolution reduction with respect to the latent image, as in [22, 23, 20]. In practice, this degradation models an intrinsic characteristic of the sensor, namely its spectral response. It can be either learned by cross-calibration or known a priori [23, 24]. Conversely, the spatial degradation matrix models the combination of different transformations which are specific to the sensor architecture, taking into account external factors including warp, blurring, translation and decimation [24, 25, 20]. In this work, since geometrical transformations such as warps and translations can be corrected using image coregistration techniques in preprocessing steps, only a spatially invariant blurring and a decimation (i.e., subsampling) will be considered. A space-invariant blur can be modeled by a symmetric convolution kernel associated with a sparse symmetric Toeplitz matrix $\mathbf{B}$ which operates a cyclic convolution on each individual band [26]. The decimation operation, denoted by the matrix $\mathbf{S}$, corresponds to a uniform downsampling operator of factor $d$ with ones on the block diagonal and zeros elsewhere, such that $\mathbf{S}^T\mathbf{S} = \mathbb{I}_n$ [20]; its adjoint $\mathbf{S}^T$ represents an upsampling transformation by zero-interpolation from $n$ to $m$ pixels. To summarize, the overall spatial degradation process corresponds to the matrix composition $\mathbf{R} = \mathbf{B}\mathbf{S}$.
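The blur-then-decimate structure of the spatial degradation can be sketched numerically. The following one-dimensional illustration is a sketch only, with hypothetical sizes and a hypothetical 3-tap symmetric kernel; it builds the cyclic blur matrix B, the decimation matrix S and their composition R = B S:

```python
import numpy as np
from scipy.linalg import circulant

# 1-D illustration (hypothetical sizes): m = 12 HR pixels, decimation factor d = 3.
m, d = 12, 3
n = m // d                                   # LR pixels

# Symmetric 3-tap blur kernel -> sparse symmetric Toeplitz (circulant) matrix B
# operating a cyclic convolution on one band.
kernel = np.zeros(m)
kernel[[0, 1, -1]] = [0.5, 0.25, 0.25]
B = circulant(kernel)

# Decimation matrix S: keeps one pixel out of d, zeros elsewhere.
S = np.zeros((m, n))
S[np.arange(0, m, d), np.arange(n)] = 1.0

R = B @ S                                    # overall spatial degradation R = B S
x = np.random.default_rng(0).normal(size=m)  # one band of the latent image
y = x @ R                                    # blurred then decimated band

# S^T S is the identity; S^T performs zero-interpolation upsampling.
assert np.allclose(S.T @ S, np.eye(n))
```

Applying R on the right of a (bands x pixels) latent image, as in (2), blurs and decimates each band independently.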
The noise corrupting multiband optical images is generally modeled as additive and Gaussian [5, 2, 27, 20]. Thus the noise matrix $\mathbf{N}$ in (2) is assumed to be distributed according to the following matrix normal distribution
$\mathbf{N} \sim \mathcal{MN}_{n_\lambda, n}\left(\mathbf{0}_{n_\lambda \times n}, \boldsymbol{\Lambda}, \boldsymbol{\Pi}\right)$ (3)
where $\mathbf{0}_{n_\lambda \times n}$ is the mean matrix, $\boldsymbol{\Lambda}$ is the row covariance matrix and $\boldsymbol{\Pi}$ is the column covariance matrix (the probability density function of a matrix normal distribution is given in [28]).
The row covariance matrix $\boldsymbol{\Lambda}$ carries information regarding the between-band spectral correlation. Following [20], in what follows, this covariance matrix will be assumed to be diagonal, which implies that the noise is independent from one band to another and characterized by a specific variance in each band. Conversely, the column covariance matrix $\boldsymbol{\Pi}$ models the noise correlation w.r.t. the pixel locations. Following a widely admitted hypothesis in the literature, this matrix is assumed to be the identity, $\boldsymbol{\Pi} = \mathbb{I}_n$, to reflect the fact that the noise is spatially independent. In real applications, both matrices $\boldsymbol{\Lambda}$ and $\boldsymbol{\Pi}$ can be estimated by prior calibration [24].
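Under these assumptions (diagonal row covariance, identity column covariance), drawing a noise realization reduces to scaling i.i.d. standard Gaussian entries band by band. A small sketch, with hypothetical per-band variances:

```python
import numpy as np

# Hypothetical per-band noise variances (diagonal of Lambda); Pi = I means the
# pixels are independent, so N = Lambda^{1/2} Z with Z i.i.d. standard normal.
rng = np.random.default_rng(0)
m_band, n_pix = 4, 10000
lam_diag = np.array([0.1, 0.2, 0.4, 0.8])
N = np.sqrt(lam_diag)[:, None] * rng.standard_normal((m_band, n_pix))
```

The empirical variance of each row of N approaches the prescribed per-band variance as the number of pixels grows.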
II-B Problem statement
Let us denote by $t_1$ and $t_2$ the acquisition times of two coregistered multiband optical images. No specific time ordering is assumed: both $t_1 < t_2$ and $t_1 > t_2$ are possible cases. Hence, without loss of generality, the HR-PAN/MS image acquired at time $t_1$ is assumed to be a low spectral resolution (i.e., PAN or MS) image of high spatial resolution, denoted $\mathbf{Y}_{t_1}$. The image acquired at time $t_2$ is an LR-HS image denoted $\mathbf{Y}_{t_2}$. The problem addressed in this paper consists of detecting significant changes between these two images. This is a challenging task mainly due to the spatial and spectral resolution dissimilarity, which prevents any use of a simple yet efficient differencing operation [4, 5]. To alleviate this issue, this work proposes to generalize the CD framework introduced in [17]. More precisely, following the widely admitted forward model described in Section II-A and adopting consistent notations, the observed images $\mathbf{Y}_{t_1}$ and $\mathbf{Y}_{t_2}$ can be related to two HR-HS latent images $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$, respectively, as follows
$\mathbf{Y}_{t_1} = \mathbf{L}\mathbf{X}_{t_1} + \mathbf{N}_{t_1}$ (4a)
$\mathbf{Y}_{t_2} = \mathbf{X}_{t_2}\mathbf{R} + \mathbf{N}_{t_2}.$ (4b)
Note that (4a) and (4b) are a specific double instance of (2). Indeed, the HR-PAN/MS (resp., LR-HS) image $\mathbf{Y}_{t_1}$ (resp., $\mathbf{Y}_{t_2}$) is assumed to be only a spectrally (resp., spatially) degraded version of the HR-HS latent image $\mathbf{X}_{t_1}$ (resp., $\mathbf{X}_{t_2}$), such that both latent images $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$ share the same spectral and spatial resolutions, which correspond to the highest resolutions of the two observed images. Thereby, provided these two latent images can be efficiently inferred, any classical differencing technique can be subsequently implemented on them to detect changes, notably at a high spatial resolution. More specifically, it would consist of evaluating an HR-HS change image, denoted $\Delta\mathbf{X}$, that would gather information related to any change between the two observed images
$\Delta\mathbf{X} = \mathbf{X}_{t_1} - \mathbf{X}_{t_2} = \left[\Delta\mathbf{x}_1, \ldots, \Delta\mathbf{x}_m\right]$ (5)
where $\Delta\mathbf{x}_p$ denotes the spectral change vector in the $p$th pixel ($p = 1, \ldots, m$). This spectral change image can be exploited by conducting a pixel-wise change vector analysis (CVA) [29] which exhibits the polar coordinates (i.e., magnitude and direction) of the spectral change vectors. To spatially locate the changes, a natural approach consists of monitoring the information contained in the magnitude part of this representation [30, 9, 10], by considering the corresponding HR spectral change energy image
$\mathbf{e} = \left[e_1, \ldots, e_m\right]$ (6)
with
$e_p = \left\|\Delta\mathbf{x}_p\right\|_2.$ (7)
When the CD problem in the $p$th pixel is formulated as the binary hypothesis testing
$\mathcal{H}_0: \Delta\mathbf{x}_p = \mathbf{0} \ \text{(no change)} \quad \text{vs.} \quad \mathcal{H}_1: \Delta\mathbf{x}_p \neq \mathbf{0} \ \text{(change)},$ (8)
the pixel-wise statistical test can be written for a given threshold $\tau$ as
$e_p \overset{\mathcal{H}_1}{\underset{\mathcal{H}_0}{\gtrless}} \tau.$ (9)
The final binary HR CD map, denoted $\hat{\mathbf{d}} \in \{0, 1\}^m$, can be derived as
$\hat{d}_p = \begin{cases} 1 & \text{if } e_p \geq \tau \ (\mathcal{H}_1) \\ 0 & \text{otherwise} \ (\mathcal{H}_0). \end{cases}$ (10)
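The magnitude-based decision rule amounts to thresholding the per-pixel l2 norm of the spectral change vectors. A minimal sketch on a synthetic change image, where the sizes, change locations and threshold are all hypothetical:

```python
import numpy as np

# Hypothetical change image: 5 spectral bands, 100 pixels, changes at 3 pixels.
rng = np.random.default_rng(1)
m_band, n_pix = 5, 100
delta_x = np.zeros((m_band, n_pix))
delta_x[:, [10, 42, 77]] = rng.normal(5.0, 1.0, size=(m_band, 3))  # sparse changes
delta_x += rng.normal(0.0, 0.1, size=delta_x.shape)                # sensor noise

# Change energy image: per-pixel l2 norm of the spectral change vector.
energy = np.linalg.norm(delta_x, axis=0)

# Decision rule: flag a change wherever the energy exceeds the threshold tau.
tau = 1.0
cd_map = (energy >= tau).astype(np.uint8)
print(int(cd_map.sum()))  # prints 3
```

With the chosen magnitudes, the three changed pixels stand well above the noise floor and are the only ones flagged.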
When complementary information needs to be extracted from the change image $\Delta\mathbf{X}$, e.g., to identify different types of changes, the whole polar representation (i.e., both magnitude and direction) can be fully exploited [9, 10]. As a consequence, to solve the multiband image CD problem, the key issue lies in the joint estimation of the pair of HR-HS latent images from the forward model (4) or, equivalently, the joint estimation of one of these latent images and the difference image, e.g., $(\mathbf{X}_{t_2}, \Delta\mathbf{X})$. The next paragraph shows that this problem can be formulated as a particular instance of multiband image fusion.
II-C Robust multiband image fusion
Linear forward models similar to (4) have been extensively investigated in the image processing literature for various applications. When a unique LR-HS image has been observed at time $t_2$, recovering the HR-HS latent image from the direct model (4b) can be cast as a super-resolution problem [31, 32]. Besides, when a complementary HR-PAN/MS image of lower spectral resolution (i.e., PAN or MS) has been simultaneously acquired under (4a), the two corresponding latent images are expected to represent exactly the same scene, i.e., $\mathbf{X}_{t_1} = \mathbf{X}_{t_2}$ or, equivalently, $\Delta\mathbf{X} = \mathbf{0}$, where the time index can be omitted. In such a scenario, estimating the common HR-HS latent image from the two observed images $\mathbf{Y}_{t_1}$ and $\mathbf{Y}_{t_2}$ is a multiband image fusion problem addressed in [33, 34, 35, 22, 23, 21, 26, 20], also referred to as MS or HS pansharpening in some specific cases [27]. Whether the problem consists in increasing the resolution of a single image or fusing multiple images of different spatial and spectral resolutions, the underlying objective is to compensate the energy trade-off of optical sensors to get highly spatially and spectrally resolved images. Those problems are often formulated as inverse problems, which are generally ill-posed or, at least, ill-conditioned. To overcome this issue, a classical approach consists of penalizing the data fitting terms derived from the linear models (4) and the noise statistics (3) with additional regularizing terms exploiting any prior information on the latent image. Various penalizations have been considered in the literature, including Tikhonov regularizations expressed in the image domain [36, 21] or in a transformed (e.g., gradient) domain [37, 38], dictionary- or patch-based regularizations [31, 26], total variation (TV) [39, 23] or regularizations based on sparse wavelet representations [40, 41].
In this work, we propose to follow a similar route by addressing the CD problem as a linear inverse problem derived from (4). However, the CD problem addressed here differs from the computational imaging problems discussed above by the fact that two distinct HR-HS latent images $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$ need to be inferred, which makes the inverse problem highly ill-posed. However, this particular applicative scenario of CD yields a natural reparametrization where relevant prior knowledge can be conveniently exploited. More precisely, since the two HR-HS latent images are related to the same scene observed at two time instants, they are expected to share a high level of similarity, i.e., the change image $\Delta\mathbf{X}$ is expected to be spatially sparse. Thus, instead of jointly estimating the pair of HR-HS latent images, we take advantage of this crucial information to rewrite the joint observation model (4) as a function of $(\mathbf{X}_{t_2}, \Delta\mathbf{X})$, i.e.,
$\mathbf{Y}_{t_1} = \mathbf{L}\left(\mathbf{X}_{t_2} + \Delta\mathbf{X}\right) + \mathbf{N}_{t_1}$ (11a)
$\mathbf{Y}_{t_2} = \mathbf{X}_{t_2}\mathbf{R} + \mathbf{N}_{t_2}.$ (11b)
It is worth noting that this dual observation model, parametrized by the new pair of images $(\mathbf{X}_{t_2}, \Delta\mathbf{X})$ to be inferred, can be straightforwardly associated with a particular instance of the multiband image fusion problem discussed earlier. Indeed, given the HR-HS change image $\Delta\mathbf{X}$ and the HR-PAN/MS image $\mathbf{Y}_{t_1}$ observed at time $t_1$, an HR-PAN/MS corrected image, denoted $\tilde{\mathbf{Y}}_{t_2}$, that would be acquired by the HR-PAN/MS sensor at time $t_2$ can be defined as
$\tilde{\mathbf{Y}}_{t_2} = \mathbf{Y}_{t_1} - \mathbf{L}\Delta\mathbf{X}.$ (12)
In such case, the HR forward model (11a) can be easily rewritten, leading to
$\tilde{\mathbf{Y}}_{t_2} = \mathbf{L}\mathbf{X}_{t_2} + \mathbf{N}_{t_1}$ (13a)
$\mathbf{Y}_{t_2} = \mathbf{X}_{t_2}\mathbf{R} + \mathbf{N}_{t_2}.$ (13b)
This observation model (13) defines a standard multiband image fusion problem for the LR-HS observed image $\mathbf{Y}_{t_2}$ and the corrected HR-PAN/MS image $\tilde{\mathbf{Y}}_{t_2}$. Consequently, since the change image $\Delta\mathbf{X}$ can be considered as an outlier term, akin to those encountered in several robust factorizing models such as robust principal component analysis (RPCA) [18] and robust nonnegative factorization [19], which rely on a similar sparse outlier term, the joint observation model (11) naturally defines a so-called robust fusion scheme whose objective function is detailed in the next paragraph.
II-D Robust fusion objective function
Because of the additive nature and the statistical properties of the noises $\mathbf{N}_{t_1}$ and $\mathbf{N}_{t_2}$, both observed images $\mathbf{Y}_{t_1}$ and $\mathbf{Y}_{t_2}$ can be assumed matrix normally distributed, i.e., $\mathbf{Y}_{t_1} \mid \mathbf{X}_{t_2}, \Delta\mathbf{X} \sim \mathcal{MN}\left(\mathbf{L}(\mathbf{X}_{t_2} + \Delta\mathbf{X}), \boldsymbol{\Lambda}_1, \mathbb{I}_m\right)$ and $\mathbf{Y}_{t_2} \mid \mathbf{X}_{t_2} \sim \mathcal{MN}\left(\mathbf{X}_{t_2}\mathbf{R}, \boldsymbol{\Lambda}_2, \mathbb{I}_n\right)$.
Besides, since the two observations are acquired by sensors of different modalities, the noise, which is sensor-dependent, can be assumed statistically independent. Thus, $\mathbf{Y}_{t_1}$ and $\mathbf{Y}_{t_2}$ are also statistically independent and the joint likelihood function can be written as a simple product of the conditional distributions $p(\mathbf{Y}_{t_1} \mid \mathbf{X}_{t_2}, \Delta\mathbf{X})$ and $p(\mathbf{Y}_{t_2} \mid \mathbf{X}_{t_2})$.
A Bayesian formulation of the robust multiband image fusion problem allows prior information to be introduced to regularize the underlying estimation problem [42]. Bayesian estimators can be derived from the joint posterior distribution
$p\left(\mathbf{X}_{t_2}, \Delta\mathbf{X} \mid \mathbf{Y}_{t_1}, \mathbf{Y}_{t_2}\right) \propto p\left(\mathbf{Y}_{t_1} \mid \mathbf{X}_{t_2}, \Delta\mathbf{X}\right) p\left(\mathbf{Y}_{t_2} \mid \mathbf{X}_{t_2}\right) p\left(\mathbf{X}_{t_2}\right) p\left(\Delta\mathbf{X}\right)$ (14)
where $p(\mathbf{X}_{t_2})$ and $p(\Delta\mathbf{X})$ correspond to the prior distributions associated with the latent and change HR-HS images, respectively, assumed to be a priori independent. Under a maximum a posteriori (MAP) paradigm, the joint MAP estimator can be derived by minimizing the negative log-posterior, leading to the following minimization problem
$\left(\hat{\mathbf{X}}_{t_2}, \Delta\hat{\mathbf{X}}\right) \in \operatorname*{argmin}_{\mathbf{X}_{t_2}, \Delta\mathbf{X}} \mathcal{J}\left(\mathbf{X}_{t_2}, \Delta\mathbf{X}\right)$ (15)
with
$\mathcal{J}\left(\mathbf{X}_{t_2}, \Delta\mathbf{X}\right) = \frac{1}{2}\left\|\boldsymbol{\Lambda}_1^{-\frac{1}{2}}\left(\mathbf{Y}_{t_1} - \mathbf{L}\left(\mathbf{X}_{t_2} + \Delta\mathbf{X}\right)\right)\right\|_F^2 + \frac{1}{2}\left\|\boldsymbol{\Lambda}_2^{-\frac{1}{2}}\left(\mathbf{Y}_{t_2} - \mathbf{X}_{t_2}\mathbf{R}\right)\right\|_F^2 + \gamma\phi_1\left(\mathbf{X}_{t_2}\right) + \lambda\phi_2\left(\Delta\mathbf{X}\right).$ (16)
The regularizing functions $\phi_1(\cdot)$ and $\phi_2(\cdot)$ can be related to the negative log-prior distributions of the HR-HS latent and change images, respectively, and the parameters $\gamma$ and $\lambda$ tune the amount of the corresponding penalizations in the overall objective function $\mathcal{J}$. These functions should be carefully designed to exploit any prior knowledge regarding the parameters of interest. As discussed in Section II-C, numerous regularizations can be advocated for the HR-HS latent image $\mathbf{X}_{t_2}$. In this work, the Tikhonov regularization proposed in [21] has been adopted
$\phi_1\left(\mathbf{X}_{t_2}\right) = \left\|\mathbf{X}_{t_2} - \bar{\mathbf{X}}\right\|_F^2$ (17)
where $\bar{\mathbf{X}}$ refers to a crude estimate of $\mathbf{X}_{t_2}$, e.g., resulting from a naive spatial interpolation of the observed LR-HS image $\mathbf{Y}_{t_2}$. This choice has been proven to maintain computational efficiency while providing accurate results [27]. Additionally, a subspace-based representation can also be adopted to enforce $\mathbf{X}_{t_2}$ to live in a previously identified subspace, as advocated in [43] and [23].
Conversely, and more critically, specific attention should be paid to the regularizing function $\phi_2(\cdot)$. This function should reflect the fact that most of the pixels are expected to remain unchanged between $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$, i.e., most of the columns of the change image $\Delta\mathbf{X}$ are expected to be null vectors. This noticeable property can be easily translated by promoting the sparsity of the spectral change energy image defined by (6). As a consequence, the regularizing function is chosen as the sparsity-inducing $\ell_1$ norm of the change energy image or, equivalently, as the $\ell_{2,1}$ norm of the change image
$\phi_2\left(\Delta\mathbf{X}\right) = \left\|\Delta\mathbf{X}\right\|_{2,1} = \sum_{p=1}^{m}\left\|\Delta\mathbf{x}_p\right\|_2.$ (18)
This regularization is a specific instance of the non-overlapping group-lasso penalization [44] which has been considered in various applications to promote structured sparsity [45, 46, 47, 48, 49, 50, 19].
The next section describes an iterative algorithm which solves the minimization problem in (15).
III. Minimization algorithm
Computing the joint MAP estimator of the HR-HS latent image $\mathbf{X}_{t_2}$ and of the change image $\Delta\mathbf{X}$ can be achieved by solving the minimization problem in (15). However, no closed-form solution can be derived for this problem. Thus, this section presents a minimization algorithm which iteratively converges to a solution. It consists in sequentially solving the problem w.r.t. each individual variable $\mathbf{X}_{t_2}$ and $\Delta\mathbf{X}$. This block coordinate descent algorithm is summarized in Algo. 1, whose main steps (fusion and correction) are detailed in what follows.
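Structurally, the algorithm alternates the two sub-problems. The following sketch shows the loop only, with deliberately simplistic scalar stand-ins for the fusion and correction solvers; these stand-ins are NOT the paper's actual steps, which are detailed in Sections III-A and III-B:

```python
import numpy as np

def bcd(y1, y2, fuse, correct, n_iter=10):
    """Alternate the fusion step (update the latent image) and the
    correction step (update the change image)."""
    dx = np.zeros_like(y1)                 # initial change image
    for _ in range(n_iter):
        x = fuse(y1, y2, dx)               # fusion given the current change image
        dx = correct(y1, x, dx)            # sparse change update given the latent image
    return x, dx

# Toy stand-ins on scalar "images": fuse averages the corrected observations,
# correct soft-thresholds the residual (hypothetical threshold 0.1).
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
fuse = lambda y1, y2, dx: 0.5 * ((y1 - dx) + y2)
correct = lambda y1, x, dx: soft(y1 - x, 0.1)
x, dx = bcd(np.array([3.0]), np.array([1.0]), fuse, correct)
```

On this toy problem the iterates converge geometrically to the fixed point of the two updates.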
III-A Fusion: optimization w.r.t. the latent image
At the $k$th iteration of the BCD algorithm, let us assume that the current value of the HR-HS change image is denoted $\Delta\mathbf{X}^{(k)}$. As suggested in Section II-C, an HR-PAN/MS corrected image that would be observed at time $t_2$, given the HR-PAN/MS image $\mathbf{Y}_{t_1}$ observed at time $t_1$ and the HR-HS change image $\Delta\mathbf{X}^{(k)}$, can be introduced as
$\tilde{\mathbf{Y}}_{t_2}^{(k)} = \mathbf{Y}_{t_1} - \mathbf{L}\Delta\mathbf{X}^{(k)}.$ (19)
Updating the current value of the HR-HS latent image then consists in minimizing w.r.t. $\mathbf{X}_{t_2}$ the partial function
$\mathcal{J}_1\left(\mathbf{X}_{t_2}\right) = \frac{1}{2}\left\|\boldsymbol{\Lambda}_1^{-\frac{1}{2}}\left(\tilde{\mathbf{Y}}_{t_2}^{(k)} - \mathbf{L}\mathbf{X}_{t_2}\right)\right\|_F^2 + \frac{1}{2}\left\|\boldsymbol{\Lambda}_2^{-\frac{1}{2}}\left(\mathbf{Y}_{t_2} - \mathbf{X}_{t_2}\mathbf{R}\right)\right\|_F^2 + \gamma\phi_1\left(\mathbf{X}_{t_2}\right).$ (20)
As noticed earlier, this sub-problem boils down to a multiband image fusion problem, which has received considerable attention in the recent image processing and remote sensing literature [23, 27, 43, 21, 20, 26]. The two difficulties arising from this formulation lie in the high dimension of the optimization problem and in the fact that the subsampling operator $\mathbf{S}$ prevents any fast resolution in the frequency domain by diagonalization of the spatial degradation matrix $\mathbf{R}$. However, with the particular choice (17) of the regularization function adopted in this paper, a closed-form solution can still be derived and efficiently implemented. It consists in solving a matrix Sylvester equation [20] of the form
$\mathbf{C}_1\mathbf{X}_{t_2} + \mathbf{X}_{t_2}\mathbf{C}_2 = \mathbf{C}_3$ (21)
where the matrices $\mathbf{C}_1$, $\mathbf{C}_2$ and $\mathbf{C}_3$ depend on the quantities involved in the problem, i.e., the virtual and observed images, the degradation operators, the noise covariance matrices and the spatially interpolated image $\bar{\mathbf{X}}$ defined in (17) (see [20] for more details). Note that when a more complex regularization function is considered (e.g., TV or a sparse representation over a dictionary), iterative algorithmic strategies can be adopted to approximate the minimizer of $\mathcal{J}_1$.
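In practice, once the three matrices are formed, the fusion update can be obtained with an off-the-shelf Sylvester solver. The matrices below are illustrative placeholders only, not the actual operators built in [20]:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Illustrative placeholders for C1, C2, C3 (the real ones are built from the
# degradation operators, noise covariances and the interpolated image).
rng = np.random.default_rng(0)
C1 = np.diag([1.0, 2.0, 3.0])                      # spectral-side operator
A = rng.normal(size=(6, 6))
C2 = A @ A.T + 6.0 * np.eye(6)                     # SPD spatial-side operator
C3 = rng.normal(size=(3, 6))

X = solve_sylvester(C1, C2, C3)                    # closed-form fusion update
```

`scipy.linalg.solve_sylvester` solves C1 X + X C2 = C3 via a Schur decomposition; the solution is unique here because the spectra of C1 and −C2 do not intersect.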
III-B Correction: optimization w.r.t. the change image
Following the same strategy as in [17], let us introduce the predicted HR-PAN/MS image
$\hat{\mathbf{Y}}_{t_2}^{(k)} = \mathbf{L}\mathbf{X}_{t_2}^{(k)}$ (22)
that would be observed at time index $t_2$ by the HR-PAN/MS sensor, given its spectral response $\mathbf{L}$ and the current state $\mathbf{X}_{t_2}^{(k)}$ of the HR-HS latent image at the $k$th iteration of the BCD algorithm. Similarly to (5), the predicted HR-PAN/MS change image can thus be defined as
$\Delta\hat{\mathbf{Y}}^{(k)} = \mathbf{Y}_{t_1} - \hat{\mathbf{Y}}_{t_2}^{(k)}.$ (23)
The objective function (16) w.r.t. $\Delta\mathbf{X}$ is then rewritten by combining (22) and (23) with (16), leading to
$\mathcal{J}_2\left(\Delta\mathbf{X}\right) = \frac{1}{2}\left\|\boldsymbol{\Lambda}_1^{-\frac{1}{2}}\left(\Delta\hat{\mathbf{Y}}^{(k)} - \mathbf{L}\Delta\mathbf{X}\right)\right\|_F^2 + \lambda\phi_2\left(\Delta\mathbf{X}\right).$ (24)
With the specific CD-driven choice of $\phi_2(\cdot)$ in (18), minimizing $\mathcal{J}_2$ is an $\ell_{2,1}$-penalized least squares problem. It is characterized by the sum of a convex and differentiable data fitting term with Lipschitz continuous gradient
$f\left(\Delta\mathbf{X}\right) = \frac{1}{2}\left\|\boldsymbol{\Lambda}_1^{-\frac{1}{2}}\left(\Delta\hat{\mathbf{Y}}^{(k)} - \mathbf{L}\Delta\mathbf{X}\right)\right\|_F^2$ (25)
and a convex but nonsmooth penalization
$g\left(\Delta\mathbf{X}\right) = \lambda\left\|\Delta\mathbf{X}\right\|_{2,1}.$ (26)
Various algorithms have been proposed to solve such convex optimization problems, including forward-backward splitting [51, 52], Douglas-Rachford splitting [53, 52] and the alternating direction method of multipliers [54, 55]. Since the proximal operator related to $g(\cdot)$ can be efficiently computed (see below), in this work we propose to resort to an iterative forward-backward algorithm, which has been shown to provide the fastest yet reliable results. This algorithmic scheme is summarized in Algo. 2. It relies on a forward step, which consists in conducting a gradient descent using the data-fitting function $f(\cdot)$ in (25), and a backward step relying on the proximal mapping associated with the penalizing function $g(\cdot)$ in (26).
Since the HR-PAN/MS observed image has only a few spectral bands, the spectral degradation matrix $\mathbf{L}$ is a fat (and generally full-row rank) matrix. Thus, the corresponding gradient operator $\nabla f$ defining the forward step (see line 5 of Algo. 2) can be easily and efficiently computed. Conversely, the proximal operator associated with $g(\cdot)$ in (26) and required during the backward step (see line 7 of Algo. 2) is defined as
$\operatorname{prox}_{\mu g}\left(\mathbf{V}\right) = \operatorname*{argmin}_{\mathbf{U}} \frac{1}{2}\left\|\mathbf{U} - \mathbf{V}\right\|_F^2 + \mu g\left(\mathbf{U}\right)$ (27)
for some $\mu > 0$. The function $g(\cdot)$ in (26) can be split as $g\left(\Delta\mathbf{X}\right) = \sum_{p=1}^{m} g_p\left(\Delta\mathbf{x}_p\right)$ with, for each column, $g_p\left(\Delta\mathbf{x}_p\right) = \lambda\left\|\Delta\mathbf{x}_p\right\|_2$. Based on the separability property of proximal operators [55], the operator (27) can be decomposed and computed for each pixel location $p$ ($p = 1, \ldots, m$) as
$\left[\operatorname{prox}_{\mu g}\left(\mathbf{V}\right)\right]_p = \operatorname{prox}_{\mu\lambda\left\|\cdot\right\|_2}\left(\mathbf{v}_p\right)$ (28)
where $\mathbf{v}_p$ stands for the $p$th column of $\mathbf{V}$. Thus, only the proximal operator associated with the Euclidean distance induced by the $\ell_2$ norm is necessary. The Moreau decomposition [55]
$\mathbf{v} = \operatorname{prox}_{h}\left(\mathbf{v}\right) + \operatorname{prox}_{h^*}\left(\mathbf{v}\right)$ (29)
establishes a relationship between the proximal operators of a function $h(\cdot)$ and its conjugate $h^*(\cdot)$. When the function $h(\cdot)$ is a general norm, its conjugate corresponds to the indicator function of the ball defined by its dual norm [48, 55], leading to
$\operatorname{prox}_{h}\left(\mathbf{v}\right) = \mathbf{v} - P_{\mathcal{B}}\left(\mathbf{v}\right)$ (30)
where $P_{\mathcal{B}}(\cdot)$ denotes the projection onto the dual-norm ball $\mathcal{B}$. When $g(\cdot)$ is defined by (26), since the $\ell_2$ norm is self-dual, this projection is
$P_{\mathcal{B}}\left(\mathbf{v}\right) = \begin{cases} \mu\lambda \dfrac{\mathbf{v}}{\left\|\mathbf{v}\right\|_2} & \text{if } \left\|\mathbf{v}\right\|_2 > \mu\lambda \\ \mathbf{v} & \text{otherwise}. \end{cases}$ (31)
Consequently, replacing (31) in (30), the proximal operator associated with the function in (28) is
$\operatorname{prox}_{\mu\lambda\left\|\cdot\right\|_2}\left(\mathbf{v}\right) = \begin{cases} \left(1 - \dfrac{\mu\lambda}{\left\|\mathbf{v}\right\|_2}\right)\mathbf{v} & \text{if } \left\|\mathbf{v}\right\|_2 > \mu\lambda \\ \mathbf{0} & \text{otherwise}. \end{cases}$ (32)
To conclude, the correction procedure can be interpreted as first a gradient descent step for spectral deblurring of the HRHS change image from the HRPAN/MS predicted change image (forward step), followed by a softthresholding of the resulting HRHS change image to promote spatial sparsity (backward step).
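A compact sketch of this correction step, with column-wise group soft-thresholding as the backward step. For readability the noise whitening of the data-fitting term is dropped, and all sizes, the regularization weight and the data are hypothetical:

```python
import numpy as np

def prox_group(v, t):
    """Column-wise group soft-thresholding: shrink the l2 norm of each
    pixel's spectral change vector by t, zeroing columns with norm <= t."""
    norms = np.linalg.norm(v, axis=0, keepdims=True)
    return np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0) * v

def forward_backward(Lmat, dy, lam, n_iter=300):
    """Minimize 0.5 * ||dy - Lmat @ dx||_F^2 + lam * ||dx||_{2,1}
    by iterating a gradient (forward) and a proximal (backward) step."""
    step = 1.0 / np.linalg.norm(Lmat, 2) ** 2          # 1 / Lipschitz constant
    dx = np.zeros((Lmat.shape[1], dy.shape[1]))
    for _ in range(n_iter):
        grad = Lmat.T @ (Lmat @ dx - dy)               # forward (gradient) step
        dx = prox_group(dx - step * grad, step * lam)  # backward (prox) step
    return dx

# Toy usage: 4 hypothetical HS bands, 2 PAN/MS bands, 6 pixels, changes at {1, 4}.
rng = np.random.default_rng(2)
Lmat = rng.normal(size=(2, 4))
dx_true = np.zeros((4, 6))
dx_true[:, [1, 4]] = 2.0
dx_hat = forward_backward(Lmat, Lmat @ dx_true, lam=0.1)
```

Starting from zero, the unchanged pixels (zero columns of the predicted change image) stay exactly zero, illustrating how the group penalty promotes spatial sparsity.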
IV. Experiments
IV-A Simulation framework
Real datasets for assessing the performance of CD algorithms are rarely available. Indeed, this assessment requires a couple of images acquired at two different dates, geometrically and radiometrically pre-corrected, presenting changes and, for the scenario considered in this paper, coming from two different optical sensors. To alleviate this issue, inspired by the well-known Wald's evaluation protocol dedicated to pansharpening algorithms [56], a framework has been proposed in [17] to assess the performance of CD algorithms when dealing with optical images of different spatial and spectral resolutions. This framework only requires a single HR-HS reference image and generates a pair of latent HR-HS images $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$ resulting from an unmixing-mixing process. This process allows synthetic yet realistic changes to be incorporated within one of these latent images, w.r.t. a predefined binary reference HR change mask locating the pixels affected by these changes and further used to assess the performance of the CD algorithms. This procedure allows various physically-inspired changes to be considered, e.g., by tuning the relative abundance of each endmember or replacing one of them by another. This protocol is briefly described below (see [17] for more details).
IV-A1 Reference image
The HR-HS reference image used in the experiments reported in this paper is an HS image of the University of Pavia, Italy, acquired by the reflective optics system imaging spectrometer (ROSIS) sensor. This image has undergone a preprocessing to smooth the atmospheric effects of water vapor absorption by removing some bands; the resulting image is used as the final HR-HS reference image.
IV-A2 Generating the changes
Using the same procedure proposed in [17], the HR-HS reference image has been linearly unmixed to define the reference matrix of endmember spectral signatures and the corresponding reference abundance matrix, whose product approximates the reference image. The two latent HR-HS images $\mathbf{X}_{t_1}$ and $\mathbf{X}_{t_2}$ are then computed as linear mixtures of the endmembers, with corresponding abundance matrices derived from the reference abundances by applying two change-inducing functions within the regions localized by the change mask. These functions are defined to simulate realistic changes in some pixels of the HR-HS latent images. Three sets of predefined change masks have been designed according to three specific change rules introduced in [17]. For each simulated pair, one of the two functions is defined as a "no-change" operator, which leads to an overall set of simulated pairs of HR-HS latent images.
IV-A3 Generating the observed images
The HR-PAN/MS observed image $\mathbf{Y}_{t_1}$ is obtained by spectrally degrading the corresponding HR-HS latent image $\mathbf{X}_{t_1}$. Two scenarios are considered. Scenario 1 consists in averaging the first bands of the HR-HS latent image to produce an HR-PAN image. Conversely, Scenario 2 considers an HR-MS image obtained by spectrally filtering the HR-HS latent image with a LANDSAT-like multiband spectral response. Moreover, to generate a spatially degraded image $\mathbf{Y}_{t_2}$, the respective latent image $\mathbf{X}_{t_2}$ has been blurred by a Gaussian kernel and subsequently equally downsampled in the vertical and horizontal directions with a downsampling ratio $d$. To illustrate, Fig. 1 shows one of the simulation configurations used during the experiments.
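The two degradations can be sketched on random stand-ins for the latent cubes: band averaging for the PAN image of Scenario 1, and band-wise Gaussian blur followed by uniform decimation for the LR-HS image. All sizes and the ratio d are illustrative only:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Random stand-ins for the latent HR-HS cubes (bands, rows, cols).
rng = np.random.default_rng(0)
x_t1 = rng.uniform(size=(20, 32, 32))
x_t2 = rng.uniform(size=(20, 32, 32))
d = 4                                   # hypothetical downsampling ratio

# Scenario 1: HR-PAN image at t1 by averaging spectral bands.
y_t1 = x_t1.mean(axis=0)

# LR-HS image at t2: band-wise Gaussian blur, then uniform decimation by d.
blurred = gaussian_filter(x_t2, sigma=(0.0, 1.0, 1.0))  # no blur across bands
y_t2 = blurred[:, ::d, ::d]
```

Setting the first component of `sigma` to zero keeps the blur strictly spatial, matching the band-wise convolution of the forward model.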
IV-B Compared methods
The proposed robust fusion-based CD technique has been compared to four methods able to deal with optical images of different spatial and spectral resolutions. The first one has been proposed in [17] and also relies on a fusion-based approach. To the authors' knowledge, it was the first operational CD technique able to operate with multiband optical images of different spatial and spectral resolutions. Contrary to the model (4) proposed in this paper, it consists in recovering a common latent image by fusing the two observed images and then predicting an HR-PAN/MS image from the underlying forward model. An HR-PAN/MS change image has then been computed as in (5) from the pair of HR-PAN/MS observed and predicted images. Finally, as recommended in [17], a spatially-regularized CVA (sCVA), similar to the decision rule detailed in Section II-B, has been conducted on this change image to produce an estimated HR CD mask.
The second method aims at producing an HR-PAN/MS predicted image by successive spatial super-resolution and spectral degradation. More precisely, an HR-HS latent image is first recovered by conducting a band-wise spatial super-resolution of the observed LR-HS image following the fast method in [32]. This latent image is then spectrally degraded to produce an HR-PAN/MS predicted image . Similarly to the previous fusion-based method, sCVA is finally conducted on the pair to produce an HR CD mask denoted . The third CD method applies the same procedure with the spatial super-resolution and spectral degradation steps in reverse order, and produces an HR change mask denoted from the pair of HR-PAN/MS images . The fourth CD method, referred to as the worst case (WC) as in [17], builds an LR change mask by crudely conducting an sCVA on a spatially degraded version of the HR-PAN/MS image and a spectrally degraded version of the LR-HS image.
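For reference, the CVA decision underlying all of these pipelines reduces to thresholding the per-pixel magnitude of the spectral difference between two co-registered images. A minimal sketch follows, omitting the spatial regularization that makes it an sCVA; `cva_mask` and `tau` are illustrative names:

```python
import numpy as np

def cva_mask(img1, img2, tau):
    """Pixel-wise change vector analysis: threshold the l2 norm of the
    spectral difference vector to obtain a binary change mask.

    img1, img2: (rows, cols, bands) arrays; tau: decision threshold.
    """
    delta = img1.astype(float) - img2.astype(float)
    magnitude = np.linalg.norm(delta, axis=-1)  # change vector magnitude
    return magnitude > tau
```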
IV-C Figures of merit
The CD performance of these four methods, as well as that of the proposed robust fusion-based method whose HR change mask is denoted , has been visually assessed from empirical receiver operating characteristic (ROC) curves, representing the estimated pixel-wise probability of detection () as a function of the probability of false alarm (). Moreover, two quantitative criteria derived from these ROC curves have been computed, namely, i) the area under the curve (AUC), corresponding to the integral of the ROC curve, and ii) the distance between the no-detection point and the point at the intersection of the ROC curve with the diagonal line defined by . For both metrics, the greater the criterion, the better the detection.
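Both criteria can be computed from the empirical ROC samples. In this sketch the AUC uses the trapezoidal rule, and the location of the no-detection point is an assumption (taken here as (PFA, PD) = (1, 0)); the paper's own conventions should take precedence:

```python
import numpy as np

def roc_criteria(pfa, pd, no_det=(1.0, 0.0)):
    """AUC and distance criterion from empirical ROC samples.

    pfa, pd: ROC samples sorted by increasing pfa, both in [0, 1].
    no_det:  assumed "no-detection" point (illustrative default).
    """
    pfa, pd = np.asarray(pfa, float), np.asarray(pd, float)
    # i) area under the ROC curve via the trapezoidal rule
    auc = np.sum(np.diff(pfa) * (pd[:-1] + pd[1:]) / 2.0)
    # ii) sample closest to the diagonal pd = 1 - pfa, taken as the
    #     (approximate) intersection of the ROC curve with that line
    k = np.argmin(np.abs(pd - (1.0 - pfa)))
    dist = np.hypot(pfa[k] - no_det[0], pd[k] - no_det[1])
    return auc, dist
```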
Table I. CD performance of the compared methods: AUC and distance criteria for Scenario 1 (first two rows) and Scenario 2 (last two rows).
IV-D Results
IV-D1 Scenario 1 (HR-PAN vs. LR-HS)
The ROC curves depicted in Fig. 2, with corresponding metrics in Table I (first two rows), correspond to the CD results obtained from a pair of HR-PAN and LR-HS observed images. Clearly, the proposed robust fusion-based CD technique outperforms the four other CD techniques. More importantly, it provides almost perfect detection even for very low PFA, i.e., for very low-energy changes. Note that the CD mask estimated by the worst-case method is defined at LR.
IV-D2 Scenario 2 (HR-MS vs. LR-HS)
Applying the same procedure as in Scenario 1, but now considering an HR-MS observed image instead of the HR-PAN one, leads to very similar overall performance. The ROC curves are depicted in Fig. 3 with corresponding metrics in Table I (last two rows). As in Scenario 1, comparing the curves in Fig. 3 shows that the proposed method offers a higher precision even when analyzing an HR observed image of lower spectral resolution.
As an additional result for Scenario 2, Fig. 4 compares the ability of the proposed method to detect changes of decreasing size against the fusion-based CD method [17] and the worst-case CD method. Figures 4(a) and 4(b) show the observed image pair and containing multiple changes with sizes varying from pixel to pixels, with the corresponding change mask presented in Fig. 4(c). Figures 4(d) and 4(e) present the change masks and recovered by the two competing methods, respectively, while the CD mask recovered by the proposed robust fusion-based method is reported in Fig. 4(f). For each technique, the decision threshold required by the CVA in (10) has been tuned to reach the highest distance value in the corresponding ROC curve. The first advantage of the proposed method is a significant decrease in the number of false alarms, which are due to errors propagated when implementing the two other methods. Moreover, these results prove once again that the proposed method achieves a better detection rate at a higher resolution, even when considering extremely localized change regions. The remaining false alarms only occur near edges between small change and no-change regions, due to the difference in spatial resolutions and the width of the blur kernel. Note also that the CD mask estimated by the worst-case method is of coarse scale since it is based on the comparison of two LR-MS images.
V Conclusion
This paper proposed a robust fusion-based change detection technique to handle two multiband optical images of different spatial and spectral resolutions. The technique relies on the definition of two high-resolution hyperspectral latent images related to the observed images via a double physically inspired forward model. The difference between these two latent images was assumed to be spatially sparse, implicitly locating the changes at a high-resolution scale. Inferring these two latent images was formulated as an inverse problem which was solved within a two-step iterative scheme. This algorithmic strategy amounted to solving a standard fusion problem and a sparsity-penalized spectral deblurring step. Contrary to the methods already proposed in the literature, modeling errors were no longer propagated between steps. A simulation protocol allowed the performance of the proposed technique, in terms of detection and precision, to be assessed and compared with the performance of various algorithms. Both the detection rate and the accuracy were clearly improved by the proposed robust fusion-based algorithm. Future work includes the detection of changes between optical and non-optical data.
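The sparsity-penalized step mentioned above typically involves the proximal operator of a group-sparse penalty (cf. [44]–[50]). The sketch below assumes an ℓ2,1-type penalty on the per-pixel spectral difference vectors; it is an illustration of that operator, not a restatement of the paper's exact algorithm:

```python
import numpy as np

def group_soft_threshold(D, tau):
    """Proximal operator of tau times the sum of column-wise l2 norms.

    D: (n_bands, n_pixels) matrix whose columns are per-pixel spectral
    differences between the two latent images. Each column is shrunk
    toward zero and vanishes entirely when its norm falls below tau,
    which is what makes the estimated change image spatially sparse.
    """
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    shrink = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return D * shrink
```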
References
 [1] V. Ferraris, N. Dobigeon, Q. Wei, and M. Chabert, “Change detection between multiband images using a robust fusionbased approach,” in Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. (ICASSP), 2016, submitted.
 [2] C. Elachi and J. Van Zyl, Introduction to the physics and techniques of remote sensing, 2nd ed., ser. Wiley series in remote sensing. Hoboken, N.J.: Wiley-Interscience, 2006.
 [3] J. B. Campbell and R. H. Wynne, Introduction to remote sensing, 5th ed. New York: Guilford Press, 2011.
 [4] A. Singh, “Review article: Digital change detection techniques using remotely-sensed data,” Int. J. Remote Sens., vol. 10, no. 6, pp. 989–1003, June 1989.
 [5] F. Bovolo and L. Bruzzone, “The time variable in data fusion: A change detection perspective,” IEEE Geosci. Remote Sens. Mag., vol. 3, no. 3, pp. 8–26, Sept. 2015.
 [6] M. Dalla Mura, S. Prasad, F. Pacifici, P. Gamba, J. Chanussot, and J. A. Benediktsson, “Challenges and Opportunities of Multimodality and Data Fusion in Remote Sensing,” Proc. IEEE, vol. 103, no. 9, pp. 1585–1601, Sept. 2015.
 [7] D. Landgrebe, “Hyperspectral image data analysis,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 17–28, 2002.
 [8] J. C. Price, “Spectral band selection for visiblenear infrared remote sensing: spectralspatial resolution tradeoffs,” IEEE Trans. Geosci. Remote Sens., vol. 35, no. 5, pp. 1277–1285, 1997.
 [9] F. Bovolo and L. Bruzzone, “A theoretical framework for unsupervised change detection based on change vector analysis in the polar domain,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 1, pp. 218–236, Jan. 2007.
 [10] F. Bovolo, S. Marchesi, and L. Bruzzone, “A framework for automatic and unsupervised detection of multiple changes in multitemporal images,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 6, pp. 2196–2212, June 2012.
 [11] A. A. Nielsen, K. Conradsen, and J. J. Simpson, “Multivariate alteration detection (MAD) and MAF postprocessing in multispectral, bitemporal image data: New approaches to change detection studies,” Remote Sens. Environment, vol. 64, no. 1, pp. 1–19, 1998.
 [12] A. A. Nielsen, “The Regularized Iteratively Reweighted MAD Method for Change Detection in Multi and Hyperspectral Data,” IEEE Trans. Image Process., vol. 16, no. 2, pp. 463–478, Feb. 2007.
 [13] J. Inglada, “Similarity measures for multisensor remote sensing images,” in Proc. IEEE Int. Conf. Geosci. Remote Sens. (IGARSS), vol. 1. IEEE, 2002, pp. 104–106.
 [14] V. Alberga, M. Idrissa, V. Lacroix, and J. Inglada, “Performance estimation of similarity measures of multisensor images for change detection applications,” in Proc. IEEE Int. Workshop Analysis Multitemporal Remote Sensing Images (MultiTemp). Leuven: IEEE, 2007, pp. 1–5.
 [15] G. Mercier, G. Moser, and S. Serpico, “Conditional copula for change detection on heterogeneous SAR data,” in Proc. IEEE Int. Conf. Geosci. Remote Sens. (IGARSS). IEEE, 2007, pp. 2394–2397.
 [16] J. Prendes, M. Chabert, F. Pascal, A. Giros, and J.Y. Tourneret, “A new multivariate statistical model for change detection in images acquired by homogeneous and heterogeneous sensors,” IEEE Trans. Image Process., vol. 24, no. 3, pp. 799–812, 2015.
 [17] V. Ferraris, N. Dobigeon, Q. Wei, and M. Chabert, “Detecting changes between optical images of different spatial and spectral resolutions: a fusionbased approach,” 2016, submitted.
 [18] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal of the ACM (JACM), vol. 58, no. 3, p. 11, 2011.
 [19] C. Févotte and N. Dobigeon, “Nonlinear hyperspectral unmixing with robust nonnegative matrix factorization,” IEEE Trans. Image Process., vol. 24, no. 12, pp. 4810–4819, 2015.
 [20] Q. Wei, N. Dobigeon, and J.Y. Tourneret, “Fast Fusion of MultiBand Images Based on Solving a Sylvester Equation,” IEEE Trans. Image Process., vol. 24, no. 11, pp. 4109–4121, Nov. 2015.
 [21] ——, “Bayesian Fusion of MultiBand Images,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 6, pp. 1117–1127, Sept. 2015.
 [22] N. Yokoya, T. Yairi, and A. Iwasaki, “Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 2, pp. 528–537, Feb. 2012.
 [23] M. Simões, J. Bioucas-Dias, L. Almeida, and J. Chanussot, “A convex formulation for hyperspectral image super-resolution via subspace-based regularization,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 6, pp. 3373–3388, June 2015.
 [24] N. Yokoya, N. Mayumi, and A. Iwasaki, “CrossCalibration for Data Fusion of EO1/Hyperion and Terra/ASTER,” IEEE J. Sel. Topics Appl. Earth Observations Remote Sens., vol. 6, no. 2, pp. 419–426, April 2013.
 [25] F. Heide, O. Gallo, M. Steinberger, J. Liu, Y.T. Tsai, W. Heidrich, M. Rouf, K. Egiazarian, D. Pajak, J. Kautz, D. Reddy, and K. Pulli, “FlexISP: A Flexible Camera Image Processing Framework,” ACM Transactions on Graphics (TOG)  Proceedings of ACM SIGGRAPH Asia 2014, vol. 33, no. 6, 2014.
 [26] Q. Wei, J. BioucasDias, N. Dobigeon, and J.Y. Tourneret, “Hyperspectral and multispectral image fusion based on a sparse representation,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 7, pp. 3658–3668, 2015.
 [27] L. Loncan, L. B. de Almeida, J. M. BioucasDias, X. Briottet, J. Chanussot, N. Dobigeon, S. Fabre, W. Liao, G. A. Licciardi, M. Simoes, J.Y. Tourneret, M. A. Veganzones, G. Vivone, Q. Wei, and N. Yokoya, “Hyperspectral pansharpening: A review,” IEEE Geosci. Remote Sens. Mag., vol. 3, no. 3, pp. 27–46, Sept. 2015.
 [28] A. K. Gupta and D. K. Nagar, Matrix Variate Distributions, ser. Monographs and Surveys in Pure and Applied Mathematics. Chapman and Hall, 1999, no. 104.
 [29] R. D. Johnson and E. S. Kasischke, “Change vector analysis: A technique for the multispectral monitoring of land cover and condition,” Int. J. Remote Sens., vol. 19, no. 3, pp. 411–426, Jan. 1998.
 [30] A. Singh, “Digital change detection techniques using remotelysensed data,” Int. J. Remote Sens., vol. 10, no. 6, pp. 989–1003, 1989.
 [31] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, Nov. 2010.
 [32] N. Zhao, Q. Wei, A. Basarab, N. Dobigeon, D. Kouame, and J.Y. Tourneret, “Fast Single Image SuperResolution Using a New Analytical Solution for – Problems,” IEEE Trans. Image Process., vol. 25, no. 8, pp. 3683–3697, Aug. 2016.
 [33] R. C. Hardie, M. T. Eismann, and G. L. Wilson, “MAP estimation for hyperspectral image resolution enhancement using an auxiliary sensor,” IEEE Trans. Image Process., vol. 13, no. 9, pp. 1174–1184, Sept. 2004.
 [34] M. T. Eismann and R. C. Hardie, “Hyperspectral resolution enhancement using high-resolution multispectral imagery with arbitrary response functions,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 455–465, March 2005.
 [35] Y. Zhang, S. De Backer, and P. Scheunders, “Noiseresistant waveletbased Bayesian fusion of multispectral and hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3834–3843, Nov. 2009.
 [36] M. Ebrahimi and E. R. Vrscay, “Regularization schemes involving selfsimilarity in imaging inverse problems,” in J. Physics: Conf. Series, 2008.
 [37] Y.W. Tai, S. Liu, M. S. Brown, and S. Lin, “Super resolution using edge prior and single image detail synthesis,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2400–2407.
 [38] J. Sun, J. Sun, Z. Xu, and H.Y. Shum, “Gradient profile prior and its applications in image superresolution and enhancement,” IEEE Trans. Image Process., vol. 20, no. 6, pp. 1529–1542, 2011.
 [39] H. A. Aly and E. Dubois, “Image upsampling using totalvariation regularization with a new observation model,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1647–1659, 2005.
 [40] C. V. Jiji, M. V. Joshi, and S. Chaudhuri, “Singleframe image superresolution using learned wavelet coefficients,” Int. J. Imaging Syst. Technol., vol. 14, pp. 105–112, 2004.
 [41] J. M. BioucasDias, “Bayesian waveletbased image deconvolution: A GEM algorithm exploiting a class of heavytailed priors,” IEEE Trans. Image Process., vol. 15, no. 4, pp. 937–951, 2006.
 [42] J. Idier, Bayesian approach to inverse problems, J. Idier, Ed. ISTE ; Wiley, 2008.
 [43] Q. Wei, N. Dobigeon, and J.Y. Tourneret, “Bayesian fusion of multispectral and hyperspectral images using a block coordinate descent method,” in Proc. IEEE GRSS Workshop Hyperspectral Image Signal Process.: Evolution in Remote Sens. (WHISPERS), 2015.
 [44] F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Convex Optimization with SparsityInducing Norms, ser. Optimization for Machine Learning. MIT Press, 2011.
 [45] S. Cotter, B. Rao, Kjersti Engan, and K. KreutzDelgado, “Sparse solutions to linear inverse problems with multiple measurement vectors,” IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2477–2488, July 2005.
 [46] C. Ding, D. Zhou, X. He, and H. Zha, “R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization,” in Proc. Int. Conf. Machine Learning (ICML), 2006, pp. 281–288.
 [47] J. Liu, S. Ji, and J. Ye, “Multi-task feature learning via efficient L2,1-norm minimization,” in Proc. Conf. Uncertainty in Artificial Intelligence (UAI), 2009.
 [48] S. Wright, R. Nowak, and M. Figueiredo, “Sparse Reconstruction by Separable Approximation,” IEEE Trans. Signal Process., vol. 57, no. 7, pp. 2479–2493, July 2009.
 [49] F. Nie, H. Huang, X. Cai, and C. H. Ding, “Efficient and robust feature selection via joint l2,1-norms minimization,” in Adv. Neural Inf. Process. Syst. (NIPS), 2010, pp. 1813–1821.
 [50] H. Lu, X. Long, and J. Lv, “A fast algorithm for recovery of jointly sparse vectors based on the alternating direction methods,” in Proc. Int. Conf. Artificial Intelligence and Statistics (AISTATS), 2011, pp. 461–469.
 [51] P. L. Combettes and V. R. Wajs, “Signal recovery by proximal forwardbackward splitting,” Multiscale Modeling & Simulation, vol. 4, no. 4, pp. 1168–1200, 2005.
 [52] P. L. Combettes and J.C. Pesquet, “Proximal splitting methods in signal processing,” in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, ser. Springer Optimization and Its Applications. Springer, 2011.
 [53] ——, “A DouglasRachford splitting approach to nonsmooth convex variational signal recovery,” IEEE J. Sel. Topics Signal Process., vol. 1, no. 4, pp. 564–574, 2007.
 [54] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
 [55] N. Parikh and S. Boyd, “Proximal Algorithms,” Foundations and Trends in Optimization, vol. 1, no. 3, pp. 123–231, 2013.
 [56] L. Wald, T. Ranchin, and M. Mangolini, “Fusion of satellite images of different spatial resolutions: assessing the quality of resulting images,” Photogrammetric engineering and remote sensing, vol. 63, no. 6, pp. 691–699, 1997.