Robust fusion of multi-band images with different spatial and spectral resolutions for change detection

Robust fusion of multi-band images with different spatial and spectral resolutions for change detection


Archetypal scenarios for change detection generally consider two images acquired through sensors of the same modality. However, in some specific cases such as emergency situations, the only images available may be those acquired through different kinds of sensors. More precisely, this paper addresses the problem of detecting changes between two multi-band optical images characterized by different spatial and spectral resolutions. This sensor dissimilarity introduces additional issues in the context of operational change detection. To alleviate these issues, classical change detection methods are applied after independent preprocessing steps (e.g., resampling) used to get the same spatial and spectral resolutions for the pair of observed images. Nevertheless, these preprocessing steps tend to throw away relevant information. Conversely, in this paper, we propose a method that more effectively uses the available information by modeling the two observed images as spatial and spectral versions of two (unobserved) latent images characterized by the same high spatial and high spectral resolutions. As they cover the same scene, these latent images are expected to be globally similar except for possible changes in sparse spatial locations. Thus, the change detection task is envisioned through a robust multi-band image fusion method which enforces the differences between the estimated latent images to be spatially sparse. This robust fusion problem is formulated as an inverse problem which is iteratively solved using an efficient block-coordinate descent algorithm. The proposed method is applied to real panchormatic/multispectral and hyperspectral images with simulated realistic changes. A comparison with state-of-the-art change detection methods evidences the accuracy of the proposed strategy.


sensing is a reliable technique for Earth surface monitoring and observation [2]. One of the most important applications using remotely sensed data is the so-called change detection (CD) problem. CD has many definitions and it is generally considered as the ability of analyzing two or more multi-date (i.e., acquired at different time instants) and possibly multi-source (i.e., acquired by different sensors) images of the same scene to detect areas where potential changes have occurred [4]. Because of the increasing number of satellites and of new policies for data distribution, more multi-temporal data becomes available. While it increases the amount of information on the present scene, it highlights some additional issues when designing operational change detection techniques.

Each remotely sensed observation image is intimately connected to the acquisition modality providing a particular excerpt of the observed scene according to the sensor specifications. For instance, optical images are generally well suited to map horizontal structures, e.g., land-cover type at large scales [6]. More particularly, remote sensing images acquired by multi-band optical sensors can be classified according to their spectral and spatial resolutions. The spectral resolution is related to the capability in sensing the electromagnetic spectrum. This term can also refer to the number of spectral bands [7], which generally leads to a commonly adopted classification of these images: panchromatic (PAN) images, characterized by a low spectral resolution, multispectral (MS) and hyperspectral (HS) images which sense part of the spectrum with higher precision. Alternatively, multi-band optical images can be classified with respect to (w.r.t.) their spatial resolution [6]. The concept of spatial resolution should be understand as the capability of representing the smallest object that can be resolved up to a specific pixel size. Images having small resolution size and finer details are generally identified as of (high resolution (HR) in contrast to low resolution (LR) images where only coarse features are observable. Because of the physical limitations of optical passive sensors, multi-band optical images suffer from a trade-off between spectral and spatial resolution [8]. To ensure that any sensor has sufficient amount of energy to guarantee a proper acquisition (in terms of, e.g., signal-to-noise ratio), one of the resolutions must be decreased allowing the other to be increased. For this reason, PAN images are generally characterized by higher spatial resolution and lower spectral resolution than MS or a HS images.

Optical images have been the most studied remote sensing modality for CD since the widely admitted additive Gaussian modeling of optical optical sensor noises allows CD techniques to be implemented through a simple operation of image differencing [4]. Originally designed for single-band images, CD differencing methods have been adapted to handle multi-band images by considering spectral change vectors [9] and transform analysis [11]. The possibility of detecting changes by exploiting both spatial and spectral information is one of the greatest advantages of these multi-band images. Nevertheless, images of same modality are not always available. In some specific scenarios, for instance consecutive to natural disasters, the availability of data imposes observation images acquired through different kind of sensors. Such disadvantageous emergency situations yet require fast, flexible and accurate methods able to handle also the incompatibilities introduced by the each sensor modality [13]. Most of the CD classical methods do not support differences in resolutions. Generally, each observed image is independently preprocessed in order to get the same resolution and then classical CD techniques are applied. However, independent resampling operations do not take into account the pair of observed images and even throw away important information. Recently, a general CD framework has been proposed in [17] to deal with multi-band images with different spatial and spectral resolutions based on a -step procedure (fusion, prediction, detection). Instead of independently preprocessing each observed image, this approach consists in recovering a latent (i.e., unobserved) HR-HS image containing changed and unchanged regions by fusing both observed images. Then, it predicts pseudo-observed images by artificially degrading the estimated HR-HS latent image using the same forward models underlying the actually observed images. As the pairs of predicted and observed observations have the same spatial and spectral resolutions, any classical multi-band CD method can be finally applied to build a change map. Albeit significantly improving detection performance when compared to crude methods relying on independent preprocessing, the -step sequential formulation appears to be non-optimal for the following twofold reasons: i) any inaccuracies in the fusion step are propagated throughout the subsequent degradation and detection steps, ii) relevant information regarding the change may be lost during the prediction steps, since it consists in spatially or spectrally degrading the latent images to estimate the pseudo-observed images. Thus, significant improvements in terms of change detection performance may be expected provided one is able to overcome both limitations.

In this paper, capitalizing on the general framework developed in [17], we show that the CD task can be formulated as a particular instance of the multi-band image fusion problem. However, contrary to the -step procedure in [17], the proposed approach jointly estimates a couple of distinct HR-HS latent images corresponding to the two acquisition times as well as the change image. Since the two HR-HS latent images are supposed to represent the same scene, they are expected to share a high level of similarity or, equivalently, to differ only in a few spatial locations. Thus, akin to numerous robust factorizing models such as robust principal component analysis [18] and robust nonnegative matrix factorization [19], the two observed images are jointly approximated by a standard linear decomposition model complemented with an HR-HS outlier term corresponding to the change image. This so-called CD-driven robust fusion of multi-band images is formulated as an inverse problem where, in particular, the outlier term is characterized by a spatial sparsity-inducing regularization. The resulting objective function is solved through the use of a block coordinate descent (BCD) algorithm, which iteratively optimizes w.r.t. one latent image and the change image. Remarkably, optimizing w.r.t. the latent image boils down to a classical multi-band image fusion step and can be efficiently conducted following the algorithmic solutions proposed in [20]. The CD map can be finally generated from the recovered HR-HS change image.

The paper is organized as follows. Section 2 formulates the change detection problem for multi-band optical image. Section 3 presents the solution for the formulated problem based on robust fusion. The simulation strategy as well as the results and considerations are present in Section 4.

2From change detection to robust fusion

2.1Generic forward model

Let us consider the image formation process as a sequence of transformations, denoted , of the original scene into an output image. The output image of a particular sensor is referred to as the observed image and denoted where and are the numbers of pixels and spectral bands in the observed image, respectively. It provides a limited version of the original scene with characteristics imposed by the image signal processor (ISP) characterizing the sensor. The original scene can be conveniently represented by an (unknown) latent image of higher spatial and spectral resolutions, , where and are the numbers of pixels and spectral bands, respectively, related to the observed image following

The intrinsic sequence of transformations of the sensor over the latent image can be typically classified as spectral or spatial degradations. On one hand, spatial degradations are related to the spatial characteristics of the sensor such as sampling scheme and optical transfer function. On the other hand, spectral degradations refer to the wavelength sensitivity and the spectral sampling. There are many ways to represent the degradation process. In this paper, is is considered as a sequence of linear operations leading to the following generic forward model [21]


  • is the spectral degradation matrix,

  • is the spatial degradation matrix,

  • is the additive term comprising sensor noise and modeling errors.

In , the left-multiplying matrix degrades the latent image by combination of some spectral bands for each pixel while the right-multiplying matrix degrades the latent image by linear combination of pixels within the same spectral band. The former degradation corresponds to a spectral resolution reduction with respect to the latent image as in [22]. In practice, this degradation models an intrinsic characteristic of the sensor, namely the spectral response. It can be either learned by cross-calibration or known a priori [23]. Conversely, the spatial degradation matrix models the combination of different transformations which are specific of the sensor architecture taking into account external factors including wrap, blurring, translation and decimation [24]. In this work, since geometrical transformations such as wrap and translations can be corrected using image co-registration techniques in pre-processing steps, only a spatially invariant blurring and a decimation (i.e., subsampling) will be considered. A space-invariant blur can be modeled by a symmetric convolution kernel associated with a sparse symmetric Toeplitz matrix which operates a cyclic convolution on the each individual band [26]. The decimation operation, denoted by the matrix , corresponds to a uniform downsampling operator1 of factor with ones on the block diagonal and zeros elsewhere, such that [20]. To summarize, the overall spatial degradation process corresponds to the matrix composition .

The noise corrupting multi-band optical images is generally modeled as additive and Gaussian [5]. Thus the noise matrix in is assumed to be distributed according to the following matrix normal distribution2

The row covariance matrix carries information regarding the between-band spectral correlation. Following [20], in what follows, this covariance matrix will be assumed to be diagonal, which implies that the noise is independent from one band to the other and characterized by a specific variance in each band. Conversely, the column covariance matrix models the noise correlation w.r.t. to the pixel locations. Following a widely admitted hypothesis of the literature, this matrix is assumed to be identity, , to reflect the fact the noise is spatially independent. In real applications, both matrices and can be estimated by previous calibrations [24].

2.2Problem statement

Let us denote and the acquisition times of two co-registered multi-band optical images. It is not assumed any specific information about time ordering, either or are possible cases. Hence, without loss of generality, the HR-PAN/MS image acquired at time is assumed to be a low spectral resolution (i.e., PAN or MS) image of high spatial resolution denoted . The image acquired at time is a LR-HS image denoted . The problem addressed in this paper consists of detecting significant changes between these two images. This is a challenging task mainly due to the spatial and spectral resolution dissimilarity which prevents any use of simple yet efficient differencing operation [4]. To alleviate this issue, this work proposes to generalize the CD framework introduced in [17]. More precisely, following the widely admitted forward model described in Section 2.1 and adopting consistent notations, the observed images and can be related to two HR-HS latent images and , respectively, as follows

Note that and are a specific double instance of . Indeed, the HR-PAN/MS (resp., LR-HS) image (resp., ) is assumed to be only a spectrally (resp., spatially) degraded version of the HR-HS latent image (resp., ) such that both latent images and share the same spectral and spatial resolutions which correspond to the highest resolutions of both observed images. Thereby, provided these two latent images can be efficiently inferred, any classical differencing technique can be subsequently implemented on them to detect changes, notably at a high spatial resolution. More specifically, it would consist of evaluating an HR-HS change image denoted that would gather information related to any change between the two observed images

where denotes the spectral change vector in the th pixel (). This spectral change image can be exploited by conducting a pixel-wise change vector analysis (CVA) [29] which exhibits the polar coordinates (i.e., magnitude and direction) of the spectral change vectors. To spatially locate the changes, a natural approach consists of monitoring the information contained in the magnitude part of this representation [30], by considering the corresponding HR spectral change energy image


When the CD problem in the th pixel is formulated as the binary hypothesis testing

the pixel-wise statistical test can be written for a given threshold as

The final binary HR CD map denoted can be derived as

When complementary information needs to be extracted from the change image , e.g., to identify different types of changes, the whole polar representation (i.e., both magnitude and direction) can be fully exploited [9]. As a consequence, to solve the multi-band image CD problem, the key issue lies in the joint estimation of the pair of HR-HS latent images from the forward model or, equivalently, the joint estimation of one of this latent image and the difference image, e.g., . The next paragraph shows that this problem can be formulated as a particular instance of multi-band image fusion.

2.3Robust multi-band image fusion

Linear forward models similar to have been extensively investigated in the image processing literature for various applications. When a unique LR-HS image has been observed at time , recovering the HR-HS latent image from the direct model can be cast as a superresolution problem [31]. Besides, when a complementary HR-PAN/MS image of lower spectral resolution (i.e., PAN or MS) has been simultaneously acquired at time under , the two corresponding latent images are expected to represent exactly the same scene, i.e., or, equivalently, where the time index can be omitted. In such scenario, estimating the common HR-HS latent image from the two observed images and is a multi-band image fusion problem addressed in [33], also referred to as MS or HS pansharpening in some specific cases [27]. Whether the problem consists in increasing the resolution of a single image or fusing multiple images of different spatial and spectral resolutions, the underlying objective consists in compensating the energy trade-off of optical sensors to get highly spatially and spectrally resolved images. Those problems are often formulated as an inverse problem, which is generally ill-posed or, at least, ill-conditioned. To overcome this issue, a classical approach consists of penalizing the data fitting terms derived from the linear models and the noise statistics with additional regularizing terms exploiting any prior information on the latent image. Various penalizations have been considered in the literature, including Tikhonov regularizations expressed in the image domain [36] or a in a transformed (e.g., gradient) domain [37], dictionary- or patch-based regularizations [31], total variation (TV) [39] or regularizations based on sparse wavelet representations [40].

In this work, we propose to follow a similar route by addressing, in a first step, the CD problem as a linear inverse problem derived from . However, the CD problem addressed here differs from the computational imaging problems discussed above by the fact that two distinct HR-HS latent images and need to be inferred, which makes the inverse problem highly ill-posed. However, this particular applicative scenario of CD yields a natural reparametrization where relevant prior knowledge can be conveniently exploited. More precisely, since the two HR-HS latent images are related to the same scene observed at two time instants, they are expected to share a high level of similarity, i.e., the change image is expected to be spatially sparse. Thus, instead of jointly estimating the pair of HR-HS latent images, we take benefit from this crucial information to rewrite the joint observation model as a function of , i.e.,

It is worthy to note that this dual observation model parametrized by the new pair of images to be inferred can be straightforwardly associated with a particular instance of the multi-band image fusion discussed earlier. Indeed, given the HR-HS change image and the HR-PAN/MS image observed at time , an HR-PAN/MS corrected image denoted that would be acquired by the HR-PAN/MS sensor at time can be defined as

In such case, the HR forward model can be easily rewritten, leading to

This observation model defines a standard multi-band image fusion problem for the LR-HS observed image and the corrected HR-PAN/MS image . Consequently, since the change image can be considered as an outlier term, akin to those encountered in several robust factorizing models such as robust principal component analysis (RPCA) [18] and robust nonnegative factorization [19] which relies on a similar sparse outlier term, the joint observation model naturally defines a so-called robust fusion scheme whose objective function is detailed in the next paragraph.

2.4Robust fusion objective function

Because of the additive nature and the statistical properties of the noise and , both observed images and can be assumed matrix normally distributed

Besides, since both observations are acquired by different modality sensors, the noise, which is sensor-dependent, can be assumed statistically independent. Thus, and are also statistically independent and the joint likelihood function can be written as a simple product of the conditional distributions and .

A Bayesian formulation of the robust multi-band image fusion problem allows prior information to be introduced to regularize the underlying estimation problem[42]. Bayesian estimators can be derived from the joint posterior distribution

where and correspond to the prior distributions associated with the latent and change HR-HS images, respectively, assumed to be a priori independent. Under a maximum a posteriori (MAP) paradigm, the joint MAP estimator can be derived by minimizing the negative log-posterior, leading to the following minimization problem


The regularizing functions and can be related to the negative log-prior distributions of the HR-HS latent and change images, respectively, and the parameters and tune the amount of corresponding penalizations in the overall objective function . These functions should be carefully designed to exploit any prior knowledge regarding the parameters of interest. As discussed in Section 2.3, numerous regularizations can be advocated for the HR-HS latent image . In this work, a Tikhonov regularization proposed in [21] has been adopted

where refers to a crude estimate of , e.g., resulting from a naive spatial interpolation of the observed LR-HS image . This choice has been proven to maintain computational efficiency while providing accurate results [27]. Additionally, a subspace-based representation can also be adopted to enforce to live in a previously identified subspace, as advocated in [43] and [23].

Conversely and more critically, a specific attention should be paid to the regularizing function . This function should reflect the fact that most of the pixels are expected to remain unchanged in and , i.e., most of the columns of the change image are expected to be null vectors. This noticeable property can be easily translated by promoting the sparsity of the spectral change energy image defined by . As a consequence, the regularizing function is chosen as the sparsity-inducing -norm of the change energy image or, equivalently, as the -norm of the change image

This regularization is a specific instance of the non-overlapping group-lasso penalization [44] which has been considered in various applications to promote structured sparsity [45].

The next section describes an iterative algorithm which solves the minimization problem in .

3Minimization algorithm

Computing the joint MAP estimator of the HR-HS latent image at time and of the change image can be achieved by solving the minimization problem in . However, no closed-form solution can be derived for this problem. Thus this section presents a minimization algorithm which iteratively converges to this solution. It consists in sequentially solving the problem w.r.t. to each individual variables and . This block coordinate descent algorithm is summarized in Algo. ? whose main steps (fusion and correction) are detailed in what follows.

3.1Fusion: optimization w.r.t

At the th iteration of the BCD algorithm, let assume that the current value of the HR-HS change image is denoted . As suggested in Section 2.3, an HR-PAN/MS corrected image that would be observed at time given the HR-PAN/MS image observed at time and the HR-HS change image can be introduced as

Updating the current value of the HR-HS latent image consists in minimizing w.r.t. the partial function

As noticed earlier, this sub-problem boils down to the multi-band image fusion which has received considerable attention in the recent image processing and remote sensing literature [23]. The two difficulties arising from this formulation lies in the high dimension of the optimization problem and in the fact that the sub-sampling operator prevents any fast resolution in the frequency domain by diagonalization of the spatial degradation matrix . However, with the particular choice of the regularization function adopted in this paper, a closed-form solution can still be derived and efficiently implemented. It consists in solving a matrix Sylvester equation [20] of the form

where the matrices , and depend on the quantities involved in the problem, i.e., the virtual and observed images, the degradation operators, the noise covariance matrices and the spatially interpolated image defined in (see [20] for more details). Note that when a more complex regularization function is considered (e.g., TV or sparse representation over a dictionary), iterative algorithmic strategies can be adopted to approximate the minimizer of .

3.2Correction: optimization w.r.t

Following the same strategy as in [17], let introduce the predicted HR-PAN/MS image

that would be observed at time index by the HR-PAN/MS sensor given its spectral response and the current state of the HR-HS latent image at the th iteration of the BCD algorithm. Similarly to , the predicted HR-PAN/MS change image can thus be defined as

The objective function w.r.t is then rewritten by combining and with , leading to

With the specific CD-driven choice of in , minimizing is an -penalized least square problem. It is characterized by the sum of a convex and differentiable data fitting term with -Lipschitz continuous gradient

and a convex but non-smooth penalization

Various algorithms have been proposed to solve such convex optimization problems including forward-backward splitting [51], Douglas-Rachford splitting [53] and alternating direction method of multipliers [54]. Since the proximal operator related to can be efficiently computed (see below), in this work, we propose to resort to an iterative forward-backward algorithm which has shown to provide the fastest yet reliable results. This algorithmic scheme is summarized in Algo. ?. It relies on a forward step which consists in conducting a gradient descent using the data-fitting function in , and a backward step relying on the proximal mapping associated with the penalizing function in .

Since the HR-PAN/MS observed image has only a few spectral bands (e.g., ), the spectral degradation matrix is a fat (and generally full-row rank) matrix. Thus, the corresponding gradient operator defining the forward step (see line ? of Algo. ?) can be easily and efficiently computed. Conversely, the proximal operator associated with in and required during the backward step (see line ? of Algo. ?) is defined as

for some . The function in can be split as with, for each column, . Based on the separability property of proximal operators [55], the operator can be decomposed and computed for each pixel location () as

where the notations stands for the th column. Thus, only the proximal operator associated with the Euclidean distance induced by the -norm is necessary. The Moreau decomposition [55]

establishes a relationship between the proximal operators of the function and its conjugate . When the function is a general norm, its conjugate corresponds to the indicator function into the ball defined by its dual norm [48], leading to

where denotes the projection. When is defined by , since the -norm is self-dual, this projection is

Consequently, replacing in , the proximal operator associated with the function in is

To conclude, the correction procedure can be interpreted as first a gradient descent step for spectral deblurring of the HR-HS change image from the HR-PAN/MS predicted change image (forward step), followed by a soft-thresholding of the resulting HR-HS change image to promote spatial sparsity (backward step).

Figure 1: \mathbf{d}_{\mathrm{HR}}
Figure 1:
Figure 2: \mathbf{X}^{t_i}
Figure 2:
Figure 3: \mathbf{X}^{t_j}
Figure 3:
Figure 4: \mathbf{Y}_{\mathrm{HR}}^{t_i}
Figure 4:
Figure 5: \mathbf{Y}_{\mathrm{LR}}^{t_j}
Figure 5:


4.1Simulation framework

Real dataset for assessing performance of CD algorithms is rarely available. Indeed, this assessment requires a couple of images acquired at two different dates, geometrically and radiometrically pre-corrected, presenting changes and, for the scenario considered in this paper, coming from two different optical sensors. To alleviate this issue, inspired by the well-known Wald’s evaluation protocol dedicated to pansharpening algorithms [56], a framework has been proposed in [17] to assess the performance of CD algorithms when dealing with optical images of different spatial and spectral resolutions. This framework only requires a single HR-HS reference image and generates a pair of latent HR-HS images and resulting from a unmixing-mixing process. This process allows synthetic yet realistic changes to be incorporated within one of these latent images, w.r.t. a pre-defined binary reference HR change mask locating the pixels affected by these changes and further used to assess the performance of the CD algorithms. This procedure allows various physically-inspired changes to be considered, e.g., by tuning the relative abundance of a each endmember or replacing one of them my another. This protocol is briefly described below (see [17] for more details).

Reference image

The HR-HS reference image used in the experiments reported in this paper is a HS image of the Pavia University, Italy, acquired by the reflective optics system imaging spectrometer (ROSIS) sensor. This image has undergone a pre-precessing to smooth the atmospheric effects of vapor water absorption by removing some bands. Thus the final HR-HS reference image is of size .

Generating the changes

Using the same procedure proposed in [17], the HR-HS reference image has been linearly unmixed to define the reference matrix of endmember spectral signatures and the corresponding reference abundance matrix such that . The two latent HR-HS images and are then computed as linear mixture of the endmembers in with corresponding abundance matrices and , respectively, derived from the reference abundances and the change mask , i.e.,


where the two change-inducing functions are defined to simulate realistic changes in some pixels of the HR-HS latent images. Three sets of predefined change masks have been designed according to three specific change rules introduced in [17]. For each simulated pair , one of the two functions is defined as a “no-change” operator, i.e., , which leads to an overall set of simulated pairs of HR-HS latent images.

Generating the observed images

The HR-PAN/MS observed image is obtained by spectrally degrading the corresponding HR-HS latent image . Two scenarios are considered. Scenario 1 consists in averaging the first bands of the HR-HS latent image to produce an HR-PAN image. Conversely, Scenario 2 considers an HR-MS image by spectrally filtering the HR-HS latent image with a -band LANDSAT-like spectral response. Moreover, to generate a spatially degraded image , the respective latent image has been blurred by a Gaussian kernel and subsequently equally down-sampled in the vertical and horizontal directions with a down-sampling ratio . To illustrate, Fig. ? shows one of the simulation configurations used during the experiments.

4.2Compared methods

The proposed robust fusion-based CD technique has been compared to four methods able to deal with optical images of different spatial and spectral resolutions. The first one has been proposed in [17] and also relies on a fusion-based approach. Up to the authors’ knowledge, it was the first operational CD technique able to operate with multi-band optical images of different spatial and spectral images. Contrary to the model proposed in this paper, it consists in recovering a common latent image by fusing the two observed images and then predicting an HR-PAN/MS image from the underlying forward model. An HR-PAN/MS change image has been then computed as in from the pair of HR-PAN/MS observed and predicted images . Finally, as recommended in [17], a spatially-regularized CVA (sCVA) similar to the decision rule detailed in Section 2.2 has been conducted on to produce an estimated HR CD mask denoted .

The second method aims at producing an HR-PAN/MS predicted image by successive spatial superresolution and spectral degradation. More precisely, an HR-HS latent image is first recovered by conducting a band-wise spatial superresolution of the observed LR-HS following the fast method in [32]. Then this latent image is spectrally degraded according to produce an HR-PAN/MS predicted image . Similarly to the previous fusion-based method, sCVA has been finally conducted on the pair to produce an HR CD mask denoted . The third CD method applies the same procedure with a reverse order of spatial superresolution and spectral degradation, and produces produces an HR change mask denoted from the pair of HR-PAN/MS images . The fourth CD method, referred to as the worst-case (WC) as in [17], build a LR change mask by crudely conducting a sCVA on a spatially degraded version of the HR-PAN/MS image and a spectrally degraded version of the LR-HS image.


The CD performances of these four methods, as well as the performance of the proposed robust fusion-based method whose HR change mask is denoted , have been visually assessed from empirical receiver operating characteristics (ROC), representing the estimated pixel-wise probability of detection () as a function of the probability of false alarm (). Moreover, two quantitative criteria derived from these ROC curves have been computed, namely, ) the area under the curve (AUC), corresponding to the integral of the ROC curve and ) the distance between the no detection point and the point at the interception of the ROC curve with the diagonal line defined by . For both metrics, greater the criterion, better the detection.

Figure 6: Scenario 1 (HR-PAN vs. LR-HS): ROC curves.
Figure 6: Scenario 1 (HR-PAN vs. LR-HS): ROC curves.
Scenarios 1 & 2: quantitative detection performance (AUC and distance).




Scenario 1 (HR-PAN vs. LR-HS)

The ROC curves depicted in Figure 6 with corresponding metrics in Table ? (first two rows) correspond to the CD results obtained from a pair of HR-PAN and LR-HS observed images. Clearly, the proposed robust fusion-based CD technique outperforms the four other CD techniques. More importantly, it provides almost perfect detections even for very low PFA, i.e., for very low energy changes. Note that the CD mask estimated by the worst-case method is defined at an LR.

Figure 7: Scenario 2 (HR-MS vs. LR-HS): ROC curves.
Figure 7: Scenario 2 (HR-MS vs. LR-HS): ROC curves.

Scenario 2 (HR-MS vs. LR-HS)

Applying the same procedure of Scenario 1 but now considering an HR-MS observed image instead of the HR-PAN observed image leads to very similar overall performance. The ROC plot is depicted in Figure 7 with corresponding metrics in Table ? (last two rows). As in Scenario 1, comparing curves in Figure 6 shows that the proposed method offers a higher precision even when analyzing a lower spectral resolution HR observed image.

As an additional result, for Scenario 2, Fig. ? compares the abilities of detecting changes of decreasing size of the proposed method against the fusion-based CD method [17] and the worst-case CD method. Figure 8 and Figure 9 shows the observed image pair and containing multiple changes with size varying from - pixel to -pixels, with the corresponding change mask presented in Figure 10. Figures Figure 11 and Figure 12 present the change masks and recovered by the two competing methods, respectively, while the CD mask recovered by the proposed robust fusion-based method is reported in Figure 13 shows the proposed CD. For each technique, the decision threshold required in the CVA in has been tuned to reach the higher distance value in the corresponding ROC curves. The first advantage of the proposed method is a significant decrease of the number of false alarm which are due to propagated errors when implementing the two other methods. Moreover, these results proves once again that the proposed method achieves a better detection rate with a higher resolution, even when considering extremely localized change regions. Remaining false alarms only occur near edges between change and no-change regions of small size due to the difference of spatial resolutions and the width of the blur kernel. Note also that the CD mask estimated by the worst-case method is of coarse scale since based on the comparison of two LR-MS images.

Figure 8: \mathbf{Y}_{\mathrm{HR}}^{t_i}
Figure 8:
Figure 9: \mathbf{Y}_{\mathrm{LR}}^{t_j}
Figure 9:
Figure 10: \mathbf{d}_{\mathrm{HR}}
Figure 10:
Figure 11: \hat{\mathbf{d}}_{\mathrm{F}}
Figure 11:
Figure 12: \hat{\mathbf{d}}_{\mathrm{WC}}
Figure 12:
Figure 13: \hat{\mathbf{d}}_{\mathrm{RF}}
Figure 13:


This paper proposed a robust fusion-based change detection technique to handle two multi-band optical observed images of different spatial and spectral resolutions. The technique was based on the definition of two high resolution hyperspectral latent images related to the observed images via a double physically-inspired forward model. The difference between these two latent images was assumed to be spatially sparse, implicitly locating the changes at a high resolution scale. Inferring these two latent images was formulated as an inverse problem which was solved within a -step iterative scheme. This algorithmic strategy amounted to solve a standard fusion problem and an -penalized spectral deblurring step. Contrary to the methods already proposed in the literature, modeling errors were not anymore propagate in-between steps. A simulation protocol allowed the performance of the proposed technique in terms of detection and precision to be assessed and compared with the performance of various algorithms. The detection rate as well as the accuracy of this method was clearly improved by this robust-fusion based algorithm. Future works include the detection of change between optical and non-optical data.


  1. The corresponding operator represents an upsampling transformation by zero-interpolation from to .
  2. The probability density function of a matrix normal distribution is given by [28]

    where is the mean matrix, is the row covariance matrix and is the column covariance matrix.


  1. V. Ferraris, N. Dobigeon, Q. Wei, and M. Chabert, “Change detection between multi-band images using a robust fusion-based approach,” in Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. (ICASSP), 2016, submitted.
  2. C. Elachi and J. Van Zyl, Introduction to the physics and techniques of remote sensing, 2nd ed., ser. Wiley series in remote sensing.1em plus 0.5em minus 0.4emHoboken, N.J.: Wiley-Interscience, 2006.
  3. J. B. Campbell and R. H. Wynne, Introduction to remote sensing, 5th ed.1em plus 0.5em minus 0.4emNew York: Guilford Press, 2011.
  4. A. Singh, “Review Article Digital change detection techniques using remotely-sensed data,” Int. J. Remote Sens., vol. 10, no. 6, pp. 989–1003, June 1989.
  5. F. Bovolo and L. Bruzzone, “The time variable in data fusion: A change detection perspective,” IEEE Geosci. Remote Sens. Mag., vol. 3, no. 3, pp. 8–26, Sept. 2015.
  6. M. Dalla Mura, S. Prasad, F. Pacifici, P. Gamba, J. Chanussot, and J. A. Benediktsson, “Challenges and Opportunities of Multimodality and Data Fusion in Remote Sensing,” Proc. IEEE, vol. 103, no. 9, pp. 1585–1601, Sept. 2015.
  7. D. Landgrebe, “Hyperspectral image data analysis,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 17–28, 2002.
  8. J. C. Price, “Spectral band selection for visible-near infrared remote sensing: spectral-spatial resolution tradeoffs,” IEEE Trans. Geosci. Remote Sens., vol. 35, no. 5, pp. 1277–1285, 1997.
  9. F. Bovolo and L. Bruzzone, “A theoretical framework for unsupervised change detection based on change vector analysis in the polar domain,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 1, pp. 218–236, Jan. 2007.
  10. F. Bovolo, S. Marchesi, and L. Bruzzone, “A framework for automatic and unsupervised detection of multiple changes in multitemporal images,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 6, pp. 2196–2212, June 2012.
  11. A. A. Nielsen, K. Conradsen, and J. J. Simpson, “Multivariate alteration detection (MAD) and MAF postprocessing in multispectral, bitemporal image data: New approaches to change detection studies,” Remote Sens. Environment, vol. 64, no. 1, pp. 1–19, 1998.
  12. A. A. Nielsen, “The Regularized Iteratively Reweighted MAD Method for Change Detection in Multi- and Hyperspectral Data,” IEEE Trans. Image Process., vol. 16, no. 2, pp. 463–478, Feb. 2007.
  13. J. Inglada, “Similarity measures for multisensor remote sensing images,” in Proc. IEEE Int. Conf. Geosci. Remote Sens. (IGARSS), vol. 1.1em plus 0.5em minus 0.4emIEEE, 2002, pp. 104–106.
  14. V. Alberga, M. Idrissa, V. Lacroix, and J. Inglada, “Performance estimation of similarity measures of multi-sensor images for change detection applications,” in Proc. IEEE Int. Workshop Analysis Multitemporal Remote Sensing Images (MultiTemp).1em plus 0.5em minus 0.4em Leuven: IEEE, 2007, pp. 1 – 5.
  15. G. Mercier, G. Moser, and S. Serpico, “Conditional copula for change detection on heterogeneous sar data,” in Proc. IEEE Int. Conf. Geosci. Remote Sens. (IGARSS).1em plus 0.5em minus 0.4emIEEE, 2007, pp. 2394–2397.
  16. J. Prendes, M. Chabert, F. Pascal, A. Giros, and J.-Y. Tourneret, “A new multivariate statistical model for change detection in images acquired by homogeneous and heterogeneous sensors,” IEEE Trans. Image Process., vol. 24, no. 3, pp. 799–812, 2015.
  17. V. Ferraris, N. Dobigeon, Q. Wei, and M. Chabert, “Detecting changes between optical images of different spatial and spectral resolutions: a fusion-based approach,” 2016, submitted.
  18. E. J. Candés, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?” Journal of the ACM (JACM), vol. 58, no. 3, p. 11, 2011.
  19. C. Févotte and N. Dobigeon, “Nonlinear hyperspectral unmixing with robust nonnegative matrix factorization,” IEEE Trans. Image Process., vol. 24, no. 12, pp. 4810–4819, 2015.
  20. Q. Wei, N. Dobigeon, and J.-Y. Tourneret, “Fast Fusion of Multi-Band Images Based on Solving a Sylvester Equation,” IEEE Trans. Image Process., vol. 24, no. 11, pp. 4109–4121, Nov. 2015.
  21. ——, “Bayesian Fusion of Multi-Band Images,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 6, pp. 1117–1127, Sept. 2015.
  22. N. Yokoya, T. Yairi, and A. Iwasaki, “Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 2, pp. 528–537, Feb. 2012.
  23. M. Simões, J. Bioucas Dias, L. Almeida, and J. Chanussot, “A convex formulation for hyperspectral image superresolution via subspace-based regularization,” IEEE Trans. Geosci. Remote Sens., vol. 6, no. 53, pp. 3373–3388, June 2015.
  24. N. Yokoya, N. Mayumi, and A. Iwasaki, “Cross-Calibration for Data Fusion of EO-1/Hyperion and Terra/ASTER,” IEEE J. Sel. Topics Appl. Earth Observations Remote Sens., vol. 6, no. 2, pp. 419–426, April 2013.
  25. F. Heide, O. Gallo, M. Steinberger, J. Liu, Y.-T. Tsai, W. Heidrich, M. Rouf, K. Egiazarian, D. Pajak, J. Kautz, D. Reddy, and K. Pulli, “FlexISP: A Flexible Camera Image Processing Framework,” ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH Asia 2014, vol. 33, no. 6, 2014.
  26. Q. Wei, J. Bioucas-Dias, N. Dobigeon, and J.-Y. Tourneret, “Hyperspectral and multispectral image fusion based on a sparse representation,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 7, pp. 3658–3668, 2015.
  27. L. Loncan, L. B. de Almeida, J. M. Bioucas-Dias, X. Briottet, J. Chanussot, N. Dobigeon, S. Fabre, W. Liao, G. A. Licciardi, M. Simoes, J.-Y. Tourneret, M. A. Veganzones, G. Vivone, Q. Wei, and N. Yokoya, “Hyperspectral pansharpening: A review,” IEEE Geosci. Remote Sens. Mag., vol. 3, no. 3, pp. 27–46, Sept. 2015.
  28. A. K. Gupta and D. K. Nagar, Matrix Variate Distribution, ser. Monographs and Surveys in Pure and Applied Mathematics.1em plus 0.5em minus 0.4emChapman and Hall, 1999, no. 104.
  29. R. D. Johnson and E. S. Kasischke, “Change vector analysis: A technique for the multispectral monitoring of land cover and condition,” Int. J. Remote Sens., vol. 19, no. 3, pp. 411–426, Jan. 1998.
  30. A. Singh, “Digital change detection techniques using remotely-sensed data,” Int. J. Remote Sens., vol. 10, no. 6, pp. 989–1003, 1989.
  31. Jianchao Yang, J. Wright, T. S. Huang, and Yi Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, Nov. 2010.
  32. N. Zhao, Q. Wei, A. Basarab, N. Dobigeon, D. Kouame, and J.-Y. Tourneret, “Fast Single Image Super-Resolution Using a New Analytical Solution for Problems,” IEEE Trans. Image Process., vol. 25, no. 8, pp. 3683–3697, Aug. 2016.
  33. R. C. Hardie, M. T. Eismann, and G. L. Wilson, “MAP estimation for hyperspectral image resolution enhancement using an auxiliary sensor,” IEEE Trans. Image Process., vol. 13, no. 9, pp. 1174–1184, Sept. 2004.
  34. M. T. Eismann and R. C. Hardie, “Hyperspectral resolution enhancement using high-resolution multispectral imagery with arbitrary response functions,” IEEE Trans. Image Process., vol. 43, no. 3, pp. 455–465, March 2005.
  35. Y. Zhang, S. De Backer, and P. Scheunders, “Noise-resistant wavelet-based Bayesian fusion of multispectral and hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3834–3843, Nov. 2009.
  36. M. Ebrahimi and E. R. Vrscay, “Regularization schemes involving self-similarity in imaging inverse problems,” in J. Physics: Conf. Series, 2008.
  37. Y.-W. Tai, S. Liu, M. S. Brown, and S. Lin, “Super resolution using edge prior and single image detail synthesis,” in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2400 – 2407.
  38. J. Sun, J. Sun, Z. Xu, and H.-Y. Shum, “Gradient profile prior and its applications in image super-resolution and enhancement,” IEEE Trans. Image Process., vol. 20, no. 6, pp. 1529 – 1542, 2011.
  39. H. A. Aly and E. Dubois, “Image up-sampling using total-variation regularization with a new observation model,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1647–1659, 2005.
  40. C. V. Jiji, M. V. Joshi, and S. Chaudhuri, “Single-frame image super-resolution using learned wavelet coefficients,” Int. J. Imaging Syst. Technol., vol. 14, pp. 105–112, 2004.
  41. J. M. Bioucas-Dias, “Bayesian wavelet-based image deconvolution: A GEM algorithm exploiting a class of heavy-tailed priors,” IEEE Trans. Image Process., vol. 15, no. 4, pp. 937–951, 2006.
  42. J. Idier, Bayesian approach to inverse problems, J. Idier, Ed.1em plus 0.5em minus 0.4emISTE ; Wiley, 2008.
  43. Q. Wei, N. Dobigeon, and J.-Y. Tourneret, “Bayesian fusion of multispectral and hyperspectral images using a block coordinate descent method,” in Proc. IEEE GRSS Workshop Hyperspectral Image SIgnal Process.: Evolution in Remote Sens. (WHISPERS), 2015.
  44. F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Convex Optimization with Sparsity-Inducing Norms, ser. Optimization for Machine Learning.1em plus 0.5em minus 0.4emMIT Press, 2011.
  45. S. Cotter, B. Rao, Kjersti Engan, and K. Kreutz-Delgado, “Sparse solutions to linear inverse problems with multiple measurement vectors,” IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2477–2488, July 2005.
  46. C. Ding, D. Zhou, X. He, and H. Zha, “R1-PCA: rotational invariant L1-norm principal component analysis for robust subspace factorization,” in Proc. Int. Conf. Machine Learning (ICML), 2006, pp. 281–288.
  47. J. Liu, S. Ji, and J. Ye, “Multi-task feature learning via efficient L2,1-norm minimization,” Uncertainty in Artificial Intelligence, 2009.
  48. S. Wright, R. Nowak, and M. Figueiredo, “Sparse Reconstruction by Separable Approximation,” IEEE Trans. Signal Process., vol. 57, no. 7, pp. 2479–2493, July 2009.
  49. F. Nie, H. Huang, X. Cai, and C. H. Ding, “Efficient and robust feature selection via joint l2, 1-norms minimization,” in Advances in neural information processing systems, 2010, pp. 1813–1821.
  50. H. Lu, X. Long, and J. Lv, “A fast algorithm for recovery of jointly sparse vectors based on the alternating direction methods,” in International Conference on Artificial Intelligence and Statistics, 2011, pp. 461–469.
  51. P. L. Combettes and V. R. Wajs, “Signal recovery by proximal forward-backward splitting,” Multiscale Modeling & Simulation, vol. 4, no. 4, pp. 1168–1200, 2005.
  52. P. L. Combettes and J.-C. Pesquet, “Proximal Splitting Methods in Signal Processing,” Fixed-Point Algorithm for Inverse Problems in Science and Engineering, no. Springer Optimization and Its Applications, 2011.
  53. ——, “A Douglas-Rachford splitting approach to nonsmooth convex variational signal recovery,” IEEE J. Sel. Topics Signal Process., vol. 1, no. 4, pp. 564–574, 2007.
  54. S. Boyd, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
  55. N. Parikh and S. Boyd, “Proximal Algorithms,” Foundations and Trends in Optimization, vol. 1, no. 3, pp. 123–231, 2013.
  56. L. Wald, T. Ranchin, and M. Mangolini, “Fusion of satellite images of different spatial resolutions: assessing the quality of resulting images,” Photogrammetric engineering and remote sensing, vol. 63, no. 6, pp. 691–699, 1997.
Comments 0
Request Comment
You are adding the first comment!
How to quickly get a good reply:
  • Give credit where it’s due by listing out the positive aspects of a paper before getting into which changes should be made.
  • Be specific in your critique, and provide supporting evidence with appropriate references to substantiate general statements.
  • Your comment should inspire ideas to flow and help the author improves the paper.

The better we are at sharing our knowledge with each other, the faster we move forward.
The feedback must be of minimum 40 characters and the title a minimum of 5 characters
Add comment
Loading ...
This is a comment super asjknd jkasnjk adsnkj
The feedback must be of minumum 40 characters
The feedback must be of minumum 40 characters

You are asking your first question!
How to quickly get a good answer:
  • Keep your question short and to the point
  • Check for grammar or spelling errors.
  • Phrase it like a question
Test description