Semiblind Hyperspectral Unmixing in the Presence of Spectral Library Mismatches ^{1}^{1}1Part of this work was published in WHISPERS 2014 [1].
Abstract
The dictionaryaided sparse regression (SR) approach has recently emerged as a promising alternative to hyperspectral unmixing (HU) in remote sensing. By using an available spectral library as a dictionary, the SR approach identifies the underlying materials in a given hyperspectral image by selecting a small subset of spectral samples in the dictionary to represent the whole image. A drawback with the current SR developments is that an actual spectral signature in the scene is often assumed to have zero mismatch with its corresponding dictionary sample, and such an assumption is considered too ideal in practice. In this paper, we tackle the spectral signature mismatch problem by proposing a dictionaryadjusted nonconvex sparsityencouraging regression (DANSER) framework. The main idea is to incorporate dictionary correcting variables in an SR formulation. A simple and low periteration complexity algorithm is tailordesigned for practical realization of DANSER. Using the same dictionary correcting idea, we also propose a robust subspace solution for dictionary pruning. Extensive simulations and realdata experiments show that the proposed method is effective in mitigating the undesirable spectral signature mismatch effects.
1 Introduction
Hyperspectral unmixing (HU) aims at decomposing pixels of an hyperspectral image (HSI) into constituent spectra that represent some pure materials. HU is useful in a number of applications, such as environment surveillance, agriculture, mine detection, and food and medicine analytics. As one of the core developments in signal and image processing for HSIs, various HU algorithms have been developed in the past two decades from different perspectives, such as Bayesian inference, nonnegative matrix factorization, convex analysis, pure pixel pursuit, and many more; see, e.g., [2, 3] for some overviews.
Recently, a class of HU algorithms based on spectral libraries has attracted much attention. A spectral library is a collection of spectral signatures of materials acquired in controlled or ideal environments, e.g., in laboratories. There are several publicly available libraries, provided by government agencies and research institutes. For example, the U.S. Geological Survey (U.S.G.S.) library [4] contains remotely sensed and extracted spectral signatures of over materials. Such rich knowledge of materials’ spectra in the existing libraries provides new opportunities for HU. By using an existing library as a dictionary, and by assuming the linear mixture model, we can treat HU as a problem of selecting a small number of spectra from the dictionary to represent all the pixels. Such a dictionaryaided semiblind formulation is fundamentally identical to the wellknown basis selection or sparse regression problem in compressive sensing (CS), and thus many welldeveloped tools from CS can be applied. Fundamentally, there are several advantages with dictionaryaided semiblind HU. First, unlike many blind HU approaches (which do not use dictionaries), dictionaryaided methods do not require assumptions such as the pure pixel assumption and the sumtoone abundance conditions. Second, dictionaryaided methods may not require knowledge of the number of materials contained in the HSIs.
Several dictionaryaided HU algorithms based on sparse regression were proposed in [5, 6, 7, 8]. The algorithms in [5, 6] and [7] treat the HU problem as a single pixelbased sparse regression (SR) problem and a multiple pixelbased collaborative sparse regression (CSR) problem, respectively. Classic norm and mixednorm minimizationbased sparse optimization methods are employed to tackle the formulated problems there. The corresponding optimization problems are convex, and thus can be solved efficiently, e.g., by some specialized alternating direction method of multipliers (ADMM)based algorithms. Three main difficulties have been observed when applying the algorithms in [5, 6, 7], however: First, the spectral library members (i.e., the recorded material spectra) exhibit very high mutual coherence. As is known in CS [9, 10, 11, 12], high mutual coherence may lead to poor performance when applying norm and mixednorm minimizationbased sparse optimization. Second, the size of a spectral library is often very large. Consequently, we are faced with a largescale problem, for which computational efficiency becomes an issue. Third, there may be mismatches between the actual spectral signatures in the scene and the dictionary samples, due to various reasons. Such dictionary mismatches affect the performance of a dictionaryaided semiblind HU to an extent which depends on the severity of the mismatches.
The first two difficulties mentioned above have been tackled by employing a dictionary pruning method based on multiple signal classification (MUSIC) [8]. MUSIC is a classical subspace method in sensor array processing [13], and recently finds its application in CS [14]. In dictionaryaided semiblind HU, MUSIC proves to be useful in preselecting some relevant spectra from a large spectral library. As a result, a sizereduced dictionary can be constructed for the SR and CSR algorithms to perform semiblind HU. After dictionary pruning, both the mutual coherence of the dictionary and the complexity of the subsequent semiblind HU algorithm can be reduced.
However, the third difficulty, spectral signature mismatches, is still not addressed. In practice, the mismatch problem arises for several reasons. First, the materials’ spectra may vary from time to time, and from site to site, subject to diverse physical conditions, e.g. strength of sunlight and temperature [15]. Second, the calibration procedure for spectral signatures may introduce errors. Third, the spatial resolutions of spectra in the dictionary can be different from those of the image, and that can also result in modeling errors. Spectral mismatches can be rather damaging to the existing semiblind HU algorithms; particularly, MUSICbased dictionary pruning is sensitive to spectral signature mismatches, as will be seen in the simulations.
Contributions In this work, we propose a dictionaryaided HU framework that takes spectral signature mismatches into consideration. Our first contribution lies in developing a new dictionaryaided HU algorithm. The formulation leading to the new algorithm uses insights of CSR, but has two key differences: 1) We model spectral signature mismatches as bounded error vectors, and attempt to compensate those errors in the formulation. 2) We employ the nonconvex () quasinorm as the sparsitypromoting function, instead of the convex mixed norm as in CSR [7]. The second endeavor is motivated by the fact that quasinorm based sparse optimization has been demonstrated to exhibit better sparsity promoting performance in certain difficult situations, e.g., the highcoherence dictionary case [16, 17, 18]. Since our formulation considers dictionary adjustment, it is more complicated to handle than the previous CSR work. We derive the new algorithm by careful design of alternating optimization, and its upshot is that the solution update at each iteration involves simple matrix operations.
The second contribution is a spectral mismatchrobust solution to dictionary pruning. We give a robust MUSIC formulation, wherein the goal is to identify spectral signature samples that are close to the true materials’ signatures, rather than being exactly equal. At first look, the robust MUSIC method seems to be computationally expensive under large dictionary sizes; specifically, for every dictionary sample, we need to solve an optimization problem. We show that, however, the optimization problem in robust MUSIC can be converted to a singlevariable optimization problem, thereby being solved with a very low computational cost. Simulations and real data experiment are used to show the effectiveness of the proposed algorithm.
Related Works: While the topic of CS and sparse regression has received enormous attention in various fields, there are comparatively fewer works that study sparse regression in the presence of dictionary mismatches. Those works usually appear in signal processing, and the application is not HU. In [19], perturbations of dictionaries were modeled as Gaussian noise, and an norm regularized total least squares criterion was proposed; there, the focus was the singlemeasurement vector case (or the singlepixel case in our problem), and constraints on the unknowns were not considered. In [20, 21], dictionary perturbations were modeled as scaling factors on each dictionary atom, and the formulated problem is convex. The algorithm in [22] attacked the dictionary mismatch problem in CSRbased directionofarrival finding. There, the mismatch was characterized by a subspace of a structured matrix, and the optimization surrogates there are also convex norm and its smoothed counterparts. We also note that quasinorm based sparse regression was applied to HU for single pixelbased unmixing without considering dictionary mismatches [23]. Here, our focus is collaborative sparse regression using multiple pixels, which is known to have both theoretical and practical advantages over the single pixelbased algorithms; we adopt the nonconvex quasinorm, where , as our sparsitypromoting function, since it has proven to show better performance in various applications; and we model spectral mismatches as deterministic bounded errors, which does not require statistical assumptions and may be more flexible.
Notation: The notations and mean that and are a realvalued dimensional vector and a realvalued matrix, respectively (resp.). The notation (resp. ) means that (resp. ) is elementwise nonnegative. The th column of a matrix is denoted by , and the th row of is denoted by . The superscript “” and “” stand for the transpose and inverse operations, resp. The orthogonal projector of the range space of is denoted by , where the superscript “” stands for the pseudoinverse; and the corresponding orthogonal complement projector is denoted by . The norm of a vector , , is denoted by . The quasinorm, , is denoted by the same notation as above. The mixed norm or quasinorm is denoted by . The Frobenious norm is denoted by .
2 Background
2.1 Signal Model and DictionaryAided Semiblind HU
Consider a remotely sensed scene that is composed of mixtures of different materials. Assuming linear mixtures, the measured hyperspectral image can be modeled as
(1) 
where denotes the hyperspectral measurement at the th pixel of the image, with being the number of spectral bands; each , , represents the spectral signature of a particular material, indexed by here; is the abundance of material at pixel ; is a noise vector; and is the number of pixels. For convenience, we will write (1) in a matrix form
(2) 
where , , , , and .
In HU, we aim by identifying and from . This amounts to a blind separation problem where hyperspectral signalspecific properties—such as pure pixel and sumtoone abundance conditions—are often utilized to attack the problem in many existing and concurrent HU studies. Dictionaryaided semiblind HU takes a different strategy. Motivated by the fact that many spectral libraries (e.g., the U.S.G.S. library [4]) have been built up in the past decades, its principle is to use one such spectral library as a dictionary to infer what are the underlying spectral signatures, and hence materials, in the scene. To put this into context, define
as a spectral dictionary, where each is a previously recorded spectral sample for a specific material, and denotes the dictionary size or the number of spectral samples. A dictionary often contains a wide variety of samples of materials, and as such is large. The key assumption with dictionaryaided semiblind HU is that the dictionary covers the spectral signatures of all materials in the scene; that is to say,
Alternatively, we can write, for each ,
(3) 
Consequently, the signal model in (2) can be written as
(4) 
where is a rowsparse matrix; to be specific, the th row of , , is the th row of , and the other rows of are all zeros.
Let us consider the sparse regression approach—currently the main approach for dictionaryaided semiblind HU. The idea is to exploit the sparsity of , thereby attempting to recover the indices correctly and the abundance matrix accurately. There is more than one way to formulate such a sparse promoting problem (see, e.g., [2, 3] and the references therein), and here we are interested in the CSR formulation [7, 8]. The CSR formulation is given as follows:
(5)  
for some prespecified constant . Here, notice that , which aims at promoting row sparsity of . As can be seen in Problem (5), CSR seeks to find a nonnegative rowsparse that provides a good approximation to . Problem (5) is convex, and a fast algorithm based on ADMM has been derived for Problem (5) [7].
2.2 Dictionary Pruning using the Subspace Approach
As discussed in the Introduction, large dictionary size and high mutual coherence with the dictionary are two main difficulties encountered in CSR and other sparse regression methods, and these two difficulties may be circumvented by applying dictionary pruning. Here, we are interested in a subspacebased dictionary pruning method called MUSIC [8]. This subspace method may be best described by studying the noiseless case . Let denote a matrix that contains the first left singular vectors of . It can be shown that in the noiseless case and under some mild assumptions^{2}^{2}2Specifically, we require that has full row rank, and that , where means that any columns of are linearly independent. Intuitively, these requirements mean that the abundance maps of the different materials are sufficiently different, and that any spectral samples in the dictionary are sufficiently different., we have
(6) 
The physical meaning of (6) is that if a spectral sample in the dictionary is also one of the spectral signatures in the scene, then it must be perpendicular to the orthogonal complement signal subspace. Also, the converse is true. From an algorithm viewpoint, the above observation suggests that we can correctly identify the indices by the simple closedform equations at the lefthand side (LHS) of (6)—at least in the noiseless case.
In practice, where noise is present, the LHS of (6) may not be exactly allzero. Under such circumstances, the following procedure can be used to estimate :

For , calculate
(7) 
Determine such that for , we have for all .
The above procedure is known as MUSIC [8, 14]. Also, note that we may use some other hyperspectral subspace identification algorithms, e.g., HySiMe [24], to estimate the signal subspace matrix from the noisy . MUSIC can in principle be used to perform dictionarybased semiblind HU. However, because of its sensitivity to colored noise and modeling error that are usually present in real data, it is used as a preprocessing algorithm for CSR (or other sparse regression methods) in practice. Specifically, MUSIC is used to discard a large number of spectral samples that yield large residuals . The remaining spectral samples then form a (much) smaller dictionary for CSR to operate. Such a dictionary pruning procedure has been found to be able to improve the HU performance and speed up the process quite significantly—see [8] for the detail.
3 Proposed Approach
The crucial assumption with dictionaryaided semiblind HU is that there is no spectrum mismatches; that is, we can always find a dictionary sample that exactly matches an actual spectral signature in the scene; cf. Eq. (3). As discussed in the Introduction, this may be not the case in reality. In this section, we will propose a dictionaryaided semiblind HU that takes into account the presence of spectrum mismatches.
3.1 DictionaryAdjusted Nonconvex SparsityEncouraging Regression (DANSER)
We assume the following spectrum mismatch model in place of (3):
(8) 
for some that characterizes the mismatch between the presumed and actual spectra of each material. Particularly, every spectral error is assumed to be bounded:
for some . Physically, our model assumes that the dictionary still covers all the actual spectral signatures in the scene, but their “best matched” spectral samples in the dictionary are subject to certain perturbations. Also, such perturbations do not go worse than in terms of magnitude.
Our rationale is to adjust the dictionary in the CSR formulation. Specifically, we write
where each is a dictionary correction variable and we assume . Following the CSR formulation in (5), we propose a new formulation as follows:
(9)  
where , and are prespecified, and note that . Comparing the original CSR formulation in (5) and the above formulation, we see two differences. First, Problem (9) adjusts the dictionary to attempt to neutralize the spectrum mismatches. Second, Problem (9) employs a nonconvex rowsparsity promoting function . The reason is that nonconvex quasinorms may exhibit better sparsity promoting performance than the norm, as reported in the sparse optimization context [18, 16, 25], and we endeavor to explore such an opportunity to improve sparse regression performance in the HU application. The formulation in (9) or its variants will be called dictionaryadjusted nonconvex sparsityencouraging regression (DANSER) in the sequel.
3.2 An Efficient Algorithm for DANSER
Having expressed the DANSER formulation in the last subsection, we turn our attention to algorithm design for DANSER. A simple approach to handle DANSER is to apply alternating optimization: fix and optimize Problem (9) with respect to (w.r.t.) at one time, fix and optimize Problem (9) w.r.t. at another time, and repeat the above cycle until some stopping criterion holds. While this approach is doable, our algorithm design experience is that it can lead to a computationally expensive algorithm. For instance, the optimization of Problem (9) w.r.t. involves joint adjustment of all the dictionary samples in an inseparable manner, which is computationally involved for large dictionary sizes. Also, the nonconvex rowsparsity promoting function used in Problem (9) introduces difficulties in the optimization of Problem (9) w.r.t. .
In view of the aforementioned issues, we formulate a modified version of Problem (9):
(10)  
where , and is a slack variable. In particular, it can be verified that if and , then Problem (10) and Problem (9) are essentially the same. It should be noted that we have applied the variable splitting technique in Problem (10) (specifically, to the variable ), which is a commonly used trick in contexts such as image reconstruction [26, 27, 28].
The modified DANSER formulation in (10) can be handled in a low periteration complexity fashion. To describe it, let us first consider the following lemma [29, 30, 31]:
Lemma 1
Let , where , . The function is strictly convex on . Also, satisfies the following identity
and the solution to the problem above is uniquely given by
(11) 
By Lemma 1, Problem (10) can be equivalently expressed as
(12)  
Now, our strategy is to perform alternating optimization w.r.t. , , , . As we will see soon, the merit of doing so is that every update admits a computationally light solution.
First, we examine the optimization w.r.t. . One can easily see that the solution is
(13) 
Second, the optimization w.r.t. is separable w.r.t. , i.e., for , we have
(14)  
Problem (14) is a projection problem and the solution is
(15) 
Third, to check the solution w.r.t. , let us first rewrite the optimization w.r.t. as
where
and . Then, the subproblem w.r.t. can be expressed as
(16)  
where
where is the th column of the identity matrix. Problem (16) is known to have a simple solution [32, 33], given by
(17) 
where . Notice that using the update (1) is desirable: the large matrix products and both only need to be calculated once before updating . Finally, by Lemma 1, the solution w.r.t. is
(18) 
The alternating optimization process described above is summarized in Algorithm 1, and we simply call it DANSER. The DANSER algorithm has the following solution convergence guarantee.
Proposition 1
The proof of the above proposition is relegated to Appendix A. Proposition 1 indicates that, although we have been dealing with Problem (10) indirectly, a stationary point of Problem (10) may be expected. Following Proposition 1, we can stop DANSER by checking the relative or absolute change of the solution . Notice that since Problem (10) is nonconvex, a good initialization would help DANSER converge to a better solution. In practice, one can use the CSR solution mentioned in Section II.A to initialize DANSER.
Remark 1
By analyzing the periteration complexity of DANSER, one can verify that the complexities of many operations scale with (i.e., the size of the dictionary) or higher. For example, to solve (13), the operations and cost and flops, respectively; and the matrix inversion requires flops. Plus, although solving the problems w.r.t. is easy, these procedures have to be repeated times at each iteration. Practically, it is therefore motivated to use a dictionary with a smaller size, or, to prune the dictionary in advance. However, due to the existence of spectral signature mismatches, directly applying MUSIC as in [8] for this purpose is not appropriate any more. To address this problem, a robust dictionary pruning method will be proposed in the next subsection.
3.3 Robust MUSIC for Dictionary Pruning
Consider the MUSIC procedure back in Section II.B. In particular, recall that the metric
(19) 
should yield a small value when exactly matches an actual spectral signature in the scene, and this property has been used as the way to prune the dictionary in the MUSIC procedure. Now, in the presence of dictionary mismatches, we propose to replace (19) by the following robust MUSIC (RMUSIC) metric
(20a)  
(20b) 
where is prespecified. The idea is the same as the DANSER development in the above subsections—adjust the dictionary to find a better match, this time in a subspace sense.
The key issue with realizing RMUSIC lies in solving Problem (20). Problem (20) is a singleratio fractional quadratic program, which is quasiconvex and can be solved, e.g., by the Dinkelbach algorithm or its variants [34, 35]. While this means that we can implement RMUSIC by applying some existing optimization algorithms, we have to solve such quasiconvex problems—which is still inefficient for large . However, by carefully examining the problem structure, we find that this particular problem can be solved quite easily. To see this, let us reexpress as
(21)  
where denotes the orthogonal projector of , and
(22) 
Since the objective function of (21) is a monotone increasing function of , computing is the same as finding the minimal value of subject to . Let us denote
(23) 
We show that
Proposition 2
The optimal value of Problem (23) can be found by solving a singlevariable problem
(24) 
The proof of Proposition 2 is relegated to Appendix B. The message revealed here is quite intriguing—the originally quasiconvex problem can be recast into a simple singlevariable problem that can be easily solved, e.g., by grid search or bisection. Practically, this means that the RMUSIC strategy can be implemented quite efficiently.
As in the previous MUSIC work [8], we use RMUSIC to perform dictionary pruning for DANSER. Specifically, we use RMUSIC to select a number of () spectral samples from , form a size dictionary, denoted by here, and then use as a pruned dictionary to run DANSER. We summarize this procedure in Algorithm 2, and we call the procedure RMUSICDANSER.
4 Computer Simulations
In this section, we use synthetic hyperspectral images to show the effectiveness of the proposed approach. In our simulations, the groundtruth spectra are randomly selected from a subset of the U.S.G.S. library that has spectral signatures. The abundances are generated following the uniform Dirichlet distribution. Throughout this section, we set the number of pixels to be . The ‘available dictionary’, , is formed by the same subset of spectra, but a perturbation (i.e., for ) is intentionally added to each spectrum. To quantify the ‘mismatch level’ of the available dictionary, we define the dictionary to modeling error ratio (DMER) as follows:
where and . The perturbation term follows the zeromean i.i.d. Gaussian distribution and is scaled to satisfy DMER specifications. We also define the signaltonoise ratio (SNR) by to quantify the noise level, where denotes the variance of the additive noise, which is also assumed to be zeromean i.i.d. Gaussian. The choice of the parameter is as follows
where is given. The parameter controls the correlation between the RMUSIC/DANSERresulted dictionary member and the original one. Specifically, under , it can be shown that the above choice of leads to .
Figs. 12 show an illustrative example. Fig. 1 shows the residues of applying MUSIC and RMUSIC to prune the dictionary . Here, we randomly pick spectra as the groundtruth materials, and then use the described dictionary to observe the performance of MUSIC and RMUSIC. The parameter of RMUSIC is set to be , and we set dB and dB in this case. We see that MUSIC has difficulty in distinguishing several groundtruth spectra from the other dictionary members (to be precise, the third and the fourth materials’ spectra), but RMUSIC can clearly differentiate the same spectra from the irrelevant spectra. Fig. 2 compares the unmixing performance of DANSER and CSR using the same case, where the pruned dictionary with spectra is obtained by RMUSIC. Here, the CSR part is performed by the CLSUnSAL algorithm [7], which is considered as a stateoftheart. For DANSER, we set , for this case. For CSR, the regularization parameter is . In this example and the forthcoming simulations and real experiment, we feed the solution of CSR to DANSER as initialization. We see that RMUSICDANSER yields much rowsparser than that of RMUSICCSR, and all of the desired spectra have been successively identified by DANSER.
In the following, we use Monte Carlo simulations to evaluate the performance of the proposed algorithms. Two performance discriminators will be used throughout this section. First, to measure the dictionary pruning performance, we define the following detection probability
where denotes the index set that indicates the groundtrue spectra, and denotes an index selection subset outputted by a dictionary pruning algorithm. Also, we will use to denote the size of the pruned dictionary. Second, to measure the unmixing performance, we calculate the following the signal to reconstruction error (SRE) [5, 6, 7]:
where is the true rowsparse abundance matrix (see (2) in Section II.A), and is the output of an unmixing algorithm.
In Fig. 3, we show the index set detection probabilities of MUSIC and RMUSIC under various DMERs. In each trial, materials are randomly picked. The in this simulation is set to be dB, and is employed. The results are averaged from trials. One can see that MUSIC is sensitive to dictionary mismatches even under high DMERs, and MUSIC is not able to identify all the true materials from the dictionary. Generally, using RMUSIC with and both yield much better detection probabilities than MUSIC under all DMERs. Interestingly, one can see that RMUSIC with admits very good detection probabilities when DMERdB; however, when the DMER is higher, using a small leads to a slight performance degradation. The reason is that a smaller implies that one is allowed to adjust ’s more significantly in RMUSIC. Hence, several similar ’s may be confused with each other. This observation suggests that a more conservative choice of should be safer for implementing RMUSIC in practice.
Fig. 4 shows the detection probabilities of RMUSIC and MUSIC under different ’s (the size of the pruned dictionary). Setting to be small may be easier for the sparse regression stage, but is considered more aggressive—some spectra corresponding to the groundtruth materials may also be discarded. We see that when DMERdB, RMUSIC with yields higher detection probabilities than that of MUSIC with , and that RMUSIC with a larger has a better detection performance.
Fig. 5 and Fig. 6 show the performance of RMUSIC under various SNRs and underlying groundtruth materials, respectively. From these figures, one can see how this algorithm is scaled by different parameters.
Beginning from Fig. 7, we show the SRE performance of the CSRbased HU algorithms. Specifically, we compare the SREs yielded by the proposed RMUSICDANSER and by MUSICCSR [8]. We also benchmark our algorithm using RMUSICCSR for fairness, since we now have seen that RMUSIC yields much better dictionary pruning performance. In all the following simulations, we fix , , for DANSER, no matter how the simulation settings change; the sparsitycontrolling parameter for DANSER and CSR are also fixed to be and except specified. We stop DANSER if , where denotes the solution at iteration , or if the number of iterations reaches . The results in all the following figures of this section are averaged from independent trials.
Fig. 7 shows the SREs of the algorithms under different DMERs. We see that under all DMERs, RMUSICDANSER yields the highest SREs. We see that RMUSICCSR also consistently yields better SRE performance than that of MUSICCSR—this suggests that RMUSIC itself can help improve the sparse unmixing performance. The runtime performance of DANSER and CSR (i.e., CLSUnSAL) is shown in Table 1 as a reference. We see that DANSER requires more time to converge compared to CSR, since it also adjusts the dictionary during its updates. Also, when the DMER gets higher, the convergence speed of DANSER improves by . This intuitively suggests that DANSER does put much effort on adjusting the dictionary (i.e., updating ) when the DMER is low.
Algorithm  DMER (dB)  

15  20  25  30  35  40  
DANSER  15.9205  16.2639  13.0023  11.0784  10.7352  9.9090 
CSR  0.8687  0.9123  1.3611  1.3472  1.6298  1.4699 
Fig. 8 and Fig. 9 show the performance of the algorithms under different number of materials and SNRs, respectively. We see that the results are similar to that in Fig. 7 — the SRE performance of RMUSICDANSER is consistently higher than the other two under comparison. Notice that for the SNRdB case, we change of DANSER and CSR to be and , respectively, to accommodate the situation where the data is more severely corrupted.
Fig. 10 shows the SREs of the algorithms under different values of . An interesting observation is that using yields much better unmixing performance than using . This results may shed some light on choosing in practice  using a large may safely capture all the true materials in the pruned dictionary, but it may also degrade the unmixing performance since the sparse regressiontype algorithms are in general in favor of smaller .
5 Real Data Experiment
In this section, we test the algorithms on the famous AVIRIS Cuprite data set which was captured in Nevada, 1997 (see http://aviris.jpl.nasa.gov/html/aviris.freedata.html). This data set has been studied for years and the abundance maps of several prominent materials are well recognized. The scene originally has 224 spectral bands between 0.4 and 2.5 Âµm, with nominal spectral resolution of 10 nm. Low SNR bands, i.e., bands 1–2, 105–115, 150–170, and 223–224, have been removed, resulting a total of 188 spectral bands. We take a subimage of the whole data set, which consists of 250191 pixels; see Fig. 11 for this subimage at spectral band . The dictionary that we use here is also the same as the one that has been used in the simulations. It has been noticed that there are calibration mismatches between the real image spectra of this scene and the spectra available in the U.S.G.S. library [5, 6, 7]. Hence, this dataset is suitable for verifying our proposed algorithm.
We first apply RMUSIC and MUSIC on this data set. We adopt the following way to qualitatively evaluate the performance: From the previous studies in [5, 6, 7], we know that the prominent materials are Alunite, Buddingtonite, Chalcedony, and Mmontmorillonite. Fig. 12 shows the residues obtained by applying MUSIC (top) and RMUSIC (buttom). For RMUSIC, we set . The red circles correspond to the library members associated with Alunite, Buddingtonite, Chalcedony, and Mmontmorillonite. We see that the residues given by RMUSIC can be clearly separated into two groups, and the group with smaller residues include the spectra of the materials that we wish to identify. We should emphasize that the situation in Fig. 12 (bottom) is desirable in practice: since the residues associated with the spectra are clearly divided into to two clusters, it is easy to decide which spectra should be kept in the pruned dictionary. In this experiment, we simply keep the spectra below the green line, which is drawn by visual inspection, and this results in a pruned dictionary with spectra. Compared to the original size , RMUSIC successfully reduces the dictionary size by 75% while preserving the spectra associated with the prominent materials.
We follow the method as in [5, 6, 7] to compare the abundance map estimation results of RMUSICCSR and RMUSICDANSER. Specifically, we plot the classification maps yielded by the U.S.G.S. Tetracorder software [36] and the estimated abundance maps of Alunite, Buddingtonite, Chalcedony, and Mmontmorillonite by RMUSICCSR and RMUSICDANSER in Fig. 13  Fig. 16. As mentioned in [5, 6, 7], the classification maps are based on the older version Cuprite data captured in 1995, while the hyperspectral image was captured in 1997, which means that the details of the new data may not be fully revealed by the classification maps  but it still makes a good reference for visual evaluation. We see that for Alunite, Buddingtonite, and Chalcedony, RMUSICCSR and RMUSICDANSER yield similar abundance maps. However, for Chalcedony and Mmontmorillonite, the abundances given by RMUSICDANSER generally have stronger intensities all over the area of interest. Also, by enumerating the nonzero rows of the solution, it is noticed that DANSER identifies active spectra from the pruned dictionary that consists of 79 spectra, indicating that the number of materials in the subimage is 15. This result is close to that yielded by HySiMe [24]; HySiMe is reliable in estimating the number of materials, and its estimate of this scene is 16. At the same time, CSR selects 19 active spectra. This observation also verifies our claim that using quasinorm yields sparser solution.
6 Conclusion
In this work, we have developed a dictionaryaided semiblind HU method that takes into account the spectral signature mismatch problem. We proposed a dictionaryadjusted CSR formulation with a nonconvex collaborative sparsity promoting regularizer. By a careful reformulation, an alternating optimization algorithm with simple periteration updates was proposed. A new dictionary pruning algorithm based on a spectral mismatchrobust MUSIC criterion was also proposed. Simulations and realdata experiments showed that the proposed algorithms are promising in improving the HU performance compared to the prior works.
Appendix
A Proof of Proposition 1
First, we claim that any limit point of the solution sequence generated by the DANSER algorithm in Algorithm 1 is a stationary point of Problem (12) (but not Problem (10) at this moment). The claim is obtained by applying a general alternating optimization (AO) result in [37, Proposition 2.7.1], which says that every limit point of a solution sequence generated by an AO algorithm is a stationary point of its tackled problem if each partial optimization problem in AO is strictly convex and has a continuously differentiable objective function within the interior of its feasible set. In our case, one can see that the partial optimizations of Problem (12) w.r.t. , , and satisfy the above condition.
Second, we claim that a stationary point of Problem (12) is also a stationary point of Problem (10). The proof is as follows. For notational convenience, let , , and denote
as the objective functions of Problem (12) and Problem (10), resp., where
Also, recall from the development in Section III.B that
Now, let be a stationary point of Problem (12), which, by definition, satisfies
(25a)  
(25b) 
where and denote the gradient of w.r.t. and , resp., and denotes the feasible set of in Problem (12) or Problem (10). From (25a), we observe that
(26) 
and the argument is as follows: is strictly convex w.r.t. by Lemma 1; and as a result of the optimality conditions of convex optimization, (25a) holds if and only if is the optimal solution to . Eq. (26) implies that . Consequently, we can rewrite (25b) as
The above equation is identical to the definition for to be a stationary point of Problem (10). Hence, we have proven that for any stationary point of Problem (12), the part is a stationary point of Problem (10).
Finally, combining the above two claims leads to the conclusion in Proposition 1.
B Proof of Proposition 2
Recall that we aim at solving
(27) 
where
(28) 
By the triangle inequality, we have
(29) 
where equality holds if and only if i) , , and ii) , . The two conditions above can be satisfied simultaneously by setting