Spectral Super-resolution With Prior Knowledge

Kumar Vijay Mishra, Myung Cho, Anton Kruger, and Weiyu Xu

The authors are with the Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA, 52242 USA, e-mail: {kumarvijay-mishra, myung-cho, anton-kruger, weiyu-xu}@uiowa.edu. Part of this work has been previously presented at the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

Abstract

We address the problem of super-resolution frequency recovery using prior knowledge of the structure of a spectrally sparse, undersampled signal. In many applications of interest, some structural information about the signal spectrum is known. The prior information might be as simple as precise knowledge of some signal frequencies, or the likelihood of a particular frequency component in the signal. We devise a general semidefinite program to recover these frequencies using theories of positive trigonometric polynomials. Our theoretical analysis shows that, given sufficient prior information, perfect signal reconstruction is possible using a number of signal samples no more than thrice the number of signal frequencies. Numerical experiments demonstrate substantial performance enhancements using our method. We show that the nominal resolution necessary for the grid-free results can be improved if prior information is suitably employed.

Index Terms: super-resolution, atomic norm, probabilistic prior, block prior, known poles.

I Introduction

In many areas of engineering, it is desired to infer the spectral contents of a measured signal. In the absence of any a priori knowledge of the underlying statistics or structure of the signal, the choice of spectral estimation technique is a subjective craft [1, 2]. However, in several applications, knowledge of signal characteristics is available through previous measurements or prior research. By including such prior knowledge during the spectrum estimation process, it is possible to enhance the performance of spectral analysis.

One useful signal attribute is its sparsity in the spectral domain. In recent years, spectral estimation methods that harness the spectral sparsity of signals have attracted considerable interest [3, 4, 5, 6]. These methods trace their origins to compressed sensing (CS), which allows accurate recovery of signals sampled at sub-Nyquist rates [7]. In the particular context of spectral estimation, the signal is assumed to be sparse in a finite discrete dictionary such as the Discrete Fourier Transform (DFT). As long as the true signal frequency lies in the center of a DFT bin, the discretization in the frequency domain faithfully represents the continuous reality of the true measurement. If the true frequency is not located on this discrete frequency grid, then the aforementioned assumption of sparsity in the DFT domain is no longer valid [8, 9]. The result is an approximation error in spectral estimation, variously referred to as scalloping loss [10], basis mismatch [11], or gridding error [12].
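The loss of DFT-domain sparsity for an off-grid frequency can be illustrated with a short sketch. The helper names (`dft`, `tone`, `significant_bins`), the signal length, and the relative threshold are illustrative assumptions and are not taken from the paper; the sketch simply counts how many DFT bins carry non-negligible energy for a tone at a bin center and for a tone halfway between two bins.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform of a list of complex samples."""
    n = len(x)
    return [sum(x[l] * cmath.exp(-2j * math.pi * k * l / n) for l in range(n))
            for k in range(n)]

def tone(freq, n):
    """Unit-amplitude complex exponential with normalized frequency freq."""
    return [cmath.exp(2j * math.pi * freq * l) for l in range(n)]

def significant_bins(x, rel_tol=1e-6):
    """Count DFT bins whose magnitude exceeds rel_tol times the largest bin.

    rel_tol is an arbitrary illustrative threshold, not a value from the paper.
    """
    mags = [abs(v) for v in dft(x)]
    peak = max(mags)
    return sum(1 for m in mags if m > rel_tol * peak)

n = 32
on_grid = significant_bins(tone(5 / n, n))     # frequency at a bin center
off_grid = significant_bins(tone(5.5 / n, n))  # frequency halfway between bins
```

With these settings the on-grid tone occupies a single bin, whereas the off-grid tone leaks into every bin, so the signal is no longer sparse in the DFT dictionary.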

Recent state-of-the-art research [5, 6, 13] has addressed the problem of basis mismatch by proposing compressed sensing in continuous spectral domain. This grid-free approach is inspired by the problems of total variation minimization [5] and atomic norm minimization [6] to recover super-resolution frequencies - lying anywhere in the continuous domain - with few random time samples of the spectrally sparse signal, provided the line spectrum maintains a nominal separation. A number of generalizations of off-the-grid compressed sensing for specific signal scenarios have also been attempted, including extension to higher dimensions [14, 15, 16].

However, these formulations of off-the-grid compressed sensing assume no prior knowledge of the signal other than sparsity in its spectrum. In fact, in many applications where signal frequencies lie in a continuous domain, such as radar [17], acoustics [18], communications [19], and power systems [20], additional prior information about the signal spectrum might be available. For example, a radar engineer might know the characteristic speed with which a fighter aircraft flies. This knowledge then places the engineer in a position to point out the ballpark location of the echo from the aircraft in the Doppler frequency spectrum. Similarly, in a precipitation radar, the spectrum widths of echoes from certain weather phenomena (tornadoes or severe storms) are known from previous observations [21]. This raises the question of whether we can use signal structures beyond sparsity to improve the performance of spectrum estimation.

There is extensive work in the compressed sensing literature on recovering sparse signals using secondary signal support structures, such as structured sparsity [22] (tree-sparsity [23], block sparsity [24], and Ising models [25]), spike trains [26, 27], nonuniform sparsity [28, 29], and multiple measurement vectors (MMVs) [30]. However, these approaches assume discrete-valued signal parameters while, in the spectrum estimation problem, frequencies are continuous-valued. Therefore, the techniques of using prior support information in discrete compressed sensing for structured sparsity do not directly extend to spectrum estimation. Moreover, it is unclear how general signal structure constraints can be imposed for super-resolution recovery of continuous-valued frequency components.

In this paper, we focus on a more generalized approach to super-resolution that addresses the foregoing problems with line spectrum estimation. We propose continuous-valued line spectrum estimation of irregularly undersampled signal in the presence of structured sparsity. Prior information about the signal spectrum comes in various forms. For example, in the spectral information concerning a rotating mechanical system, the frequencies of the supply lines or interfering harmonics might be precisely known [31]. However, in a communication problem, the engineer might only know the frequency band in which a signal frequency is expected to show up. Often the prior knowledge is not even specific to the level of knowing the frequency subbands precisely. The availability of previous measurements, such as in remote sensing or bio-medicine, can aid in knowing the likelihood of having an active signal frequency in the neighborhood of a specific spectral band. In this paper, we greatly broaden the scope of prior information that can range from knowing only the likelihood of occurrence of frequency components in a spectral subband to exactly knowing the location of some of the frequencies.

In all these cases, we propose a precise semidefinite program to perfectly recover all the frequency components. When some frequencies are precisely known, we propose to use conditional atomic norm minimization to recover the off-the-grid frequencies. In practice, the frequencies are seldom precisely known. However, as long as the frequency locations are approximately known to the user, we show that the spectrally sparse signal could still be perfectly reconstructed. Here, we introduce constrained atomic norm minimization that accepts the block priors - frequency subbands in which true spectral contents of the signal are known to exist - in its semidefinite formulation. When only the probability density function of signal frequencies is known, we incorporate such a probabilistic prior in the spectral estimation problem by suggesting the minimization of a weighted atomic norm. The key is to transform the dual of atomic norm minimization to a semidefinite program using linear matrix inequalities (LMI). These linear matrix inequalities are, in turn, provided by theories of positive trigonometric polynomials [32]. Our methods boost the signal recovery by requiring fewer samples for spectral estimation and by decreasing reliance on the minimum resolution necessary for super-resolution. If the prior information locates the frequencies within very close boundaries of their true values, then we show that it is possible to perfectly recover the signal using a number of samples no more than thrice the number of signal frequencies.

Our work has close connections with a rich heritage of research in spectral estimation. For uniformly sampled or regularly spaced signals, there are a number of existing approaches for spectral estimation by including known signal characteristics in the estimation process. The classical Prony’s method can be easily modified to account for known frequencies [18]. Variants of the subspace-based frequency estimation methods such as MUSIC (MUltiple SIgnal Classification) and ESPRIT (Estimation of Signal Parameters via Rotation Invariance Techniques) have also been formulated [33, 31], where prior knowledge can be incorporated for parameter estimation. For applications wherein only approximate knowledge of the frequencies is available, the spectral estimation described in [34] applies circular von Mises probability distribution on the spectrum.

For irregularly spaced or non-uniformly sampled signals, sparse signal recovery methods that leverage prior information have recently gained attention [28, 29, 35, 36]. Compressed sensing with clustered priors was addressed in [37], where the prior information on the number of clusters and the size of each cluster was assumed to be unknown. In [38], MUSIC was extended to undersampled, irregularly spaced sparse signals in a discrete dictionary, while [39] analyzed the performance of snapshot-MUSIC for uniformly sampled signals in a continuous dictionary. Our technique is more general; it applies to irregularly sampled signals in a continuous dictionary, and is, therefore, different from known works on utilizing prior information for spectral estimation of regularly sampled signals.

II Problem Formulation

In general, the prior information can be available for any of the signal parameters such as amplitude, phase or frequencies. However, in this paper, we restrict the available knowledge to only the frequencies of the signal. We assume that the amplitude and phase information of any spectral component is not known, irrespective of the pattern of known frequency information. Our approach is to first analyze the case of more nebulous prior information, namely probabilistic priors, followed by an interesting special case of block priors. The case when some frequencies are precisely known is considered last where, unlike the previously considered cases, we recover the signal using the semidefinite program for the primal problem.

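We consider a frequency-sparse signal $x[l]$ expressed as a sum of $s$ complex exponentials,

$$x[l] = \sum_{j=1}^{s} c_j e^{i 2\pi f_j l} = \sum_{j=1}^{s} |c_j|\, a(f_j, \phi_j)[l], \quad l \in \mathcal{N}, \tag{II.1}$$

where $c_j = |c_j| e^{i\phi_j}$ ($i = \sqrt{-1}$) represents the complex coefficient of the frequency $f_j \in [0, 1]$, with amplitude $|c_j| > 0$, phase $\phi_j \in [0, 2\pi)$, and frequency-atom $a(f_j, \phi_j)[l] = e^{i(2\pi f_j l + \phi_j)}$. We use the index set $\mathcal{N} = \{l \mid 0 \le l \le n-1\}$, where $|\mathcal{N}| = n$, $n \in \mathbb{N}$, to represent the time samples of the signal. We further suppose that the signal in (II.1) is observed on the index set $\mathcal{M} \subseteq \mathcal{N}$, $|\mathcal{M}| = m \le n$, where the $m$ observations are chosen uniformly at random. Our objective is to recover all the continuous-valued frequencies with very high accuracy using this undersampled signal.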
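As a minimal sketch of this observation model, assuming the signal takes the form $x[l] = \sum_j c_j e^{i 2\pi f_j l}$ for $l = 0, \dots, n-1$, the snippet below synthesizes a spectrally sparse signal and draws a random subset of sample indices. The function names, the particular frequencies and coefficients, and the fixed seed are illustrative choices, not values from the paper.

```python
import cmath
import math
import random

def sparse_signal(freqs, coeffs, n):
    """Samples x[l] = sum_j c_j * exp(i 2 pi f_j l) for l = 0, ..., n-1."""
    return [sum(c * cmath.exp(2j * math.pi * f * l) for f, c in zip(freqs, coeffs))
            for l in range(n)]

def random_observation_set(n, m, seed=0):
    """Pick m of the n sample indices uniformly at random (sorted for readability)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), m))

n, m = 64, 20
freqs = [0.1234, 0.4567, 0.8]               # off-grid normalized frequencies (illustrative)
coeffs = [1.0, 0.5 * cmath.exp(1j), 2.0j]   # complex coefficients |c_j| exp(i phi_j)
x = sparse_signal(freqs, coeffs, n)
observed = random_observation_set(n, m)
x_observed = {l: x[l] for l in observed}    # the undersampled signal seen by the estimator
```

Only `x_observed` would be available to a recovery procedure; the full vector `x` is kept here solely to make the example checkable.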

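The signal in (II.1) can be modeled as a positive linear combination of the unit-norm frequency-atoms $a(f_j, \phi_j) \in \mathcal{A} \subset \mathbb{C}^n$, where $\mathcal{A}$ is the set of all the frequency-atoms. These frequency atoms are basic units for synthesizing the frequency-sparse signal. This leads to the following formulation of the atomic norm $\|\hat{x}\|_{\mathcal{A}}$, a sparsity-enforcing analog of the $\ell_1$ norm for a general atomic set $\mathcal{A}$:

$$\|\hat{x}\|_{\mathcal{A}} = \inf_{c_j, f_j} \left\{ \sum_{j=1}^{s} |c_j| : \hat{x}[l] = \sum_{j=1}^{s} c_j e^{i 2\pi f_j l},\ l \in \mathcal{N} \right\}. \tag{II.2}$$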

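To estimate the remaining $\mathcal{N} \setminus \mathcal{M}$ samples of the signal $x$, [40] suggests minimizing the atomic norm $\|\hat{x}\|_{\mathcal{A}}$ among all vectors $\hat{x}$ leading to the same observed samples as $x$. Intuitively, the atomic norm minimization is similar to $\ell_1$-minimization being the tightest convex relaxation of the combinatorial $\ell_0$-minimization problem. The primal convex optimization problem for atomic norm minimization can be formulated as follows,

$$\underset{\hat{x}}{\text{minimize}}\ \|\hat{x}\|_{\mathcal{A}} \quad \text{subject to}\ \hat{x}[l] = x[l],\ l \in \mathcal{M}. \tag{II.3}$$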

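Equivalently, the off-the-grid compressed sensing [6] suggests the following semidefinite characterization for $\|\hat{x}\|_{\mathcal{A}}$:

Definition II.1.

[6] Let $T_n$ denote the $n \times n$ positive semidefinite Toeplitz matrix, $t \in \mathbb{R}^{+}$, $\mathrm{Tr}(\cdot)$ denote the trace operator and $(\cdot)^{*}$ denote the complex conjugate. Then,

$$\|\hat{x}\|_{\mathcal{A}} = \inf_{T_n, t} \left\{ \frac{1}{2|\mathcal{N}|} \mathrm{Tr}(T_n) + \frac{1}{2} t : \begin{bmatrix} T_n & \hat{x} \\ \hat{x}^{*} & t \end{bmatrix} \succeq 0 \right\}. \tag{II.4}$$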

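The positive semidefinite Toeplitz matrix $T_n$ is related to the frequency atoms through the following Vandermonde decomposition result by Carathéodory [41]:

$$T_n = U R U^{*}, \tag{II.5}$$

where

$$U_{lj} = a(f_j, \phi_j)[l], \tag{II.6}$$
$$R = \mathrm{diag}([b_1, \dots, b_r]). \tag{II.7}$$

The diagonal elements of $R$ are real and positive, and $r = \mathrm{rank}(T_n)$.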
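The structure implied by the Vandermonde decomposition can be checked numerically. The sketch below assumes the decomposition has the form $T_n = U R U^{*}$ with atom entries $e^{i(2\pi f_j l + \phi_j)}$ and positive diagonal weights $b_j$; under that assumption the phases cancel, so each entry is $\sum_j b_j e^{i 2\pi f_j (a - b)}$. The function names and numerical values are hypothetical, and positive semidefiniteness is only spot-checked through random quadratic forms rather than an eigenvalue computation.

```python
import cmath
import math
import random

def toeplitz_from_atoms(freqs, weights, n):
    """Build T = U R U^* entrywise: T[a][b] = sum_j b_j exp(i 2 pi f_j (a - b))."""
    return [[sum(w * cmath.exp(2j * math.pi * f * (a - b))
                 for f, w in zip(freqs, weights))
             for b in range(n)] for a in range(n)]

def quadratic_form(T, v):
    """Return v^* T v, which is real and nonnegative when T is positive semidefinite."""
    n = len(v)
    return sum(v[a].conjugate() * T[a][b] * v[b] for a in range(n) for b in range(n))

n = 8
freqs = [0.12, 0.37, 0.81]     # illustrative frequencies
weights = [1.5, 0.7, 2.0]      # illustrative positive diagonal entries of R
T = toeplitz_from_atoms(freqs, weights, n)

trace = sum(T[a][a] for a in range(n))
is_toeplitz = all(abs(T[a][b] - T[a + 1][b + 1]) < 1e-12
                  for a in range(n - 1) for b in range(n - 1))
is_hermitian = all(abs(T[a][b] - T[b][a].conjugate()) < 1e-12
                   for a in range(n) for b in range(n))
rng = random.Random(1)
forms = [quadratic_form(T, [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)])
         for _ in range(20)]
```

The trace equals $n \sum_j b_j$ in this construction, which is why the trace term in the semidefinite characterization acts as a proxy for the sum of the coefficient magnitudes.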

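Consistent with this definition, the atomic norm minimization problem for the frequency-sparse signal recovery can now be formulated as a semidefinite program (SDP) with $m$ affine equality constraints:

$$\underset{T_n, \hat{x}, t}{\text{minimize}}\ \frac{1}{2|\mathcal{N}|} \mathrm{Tr}(T_n) + \frac{1}{2} t \quad \text{subject to}\ \begin{bmatrix} T_n & \hat{x} \\ \hat{x}^{*} & t \end{bmatrix} \succeq 0,\ \ \hat{x}[l] = x[l],\ l \in \mathcal{M}. \tag{II.8}$$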

When some information about the signal frequencies is known a priori, our goal is to find a signal vector $\hat{x}$ in (II.8) whose frequencies satisfy additional constraints imposed by prior information. In other words, if $\mathcal{C}$ denotes the set of constraints arising due to prior knowledge of frequencies, then our goal is to find the infimum in (II.2) over $f_j \in \mathcal{C}$.

While framing the problem to harness the prior information, a common approach in compressed sensing algorithms is to replace the classical $\ell_1$ minimization program with its weighted counterpart [28, 29]. However, signals with continuous-valued frequencies do not lead to a direct application of the weighted approach. Rather, such an application leads to a fundamental conundrum: the Vandermonde decomposition of positive semidefinite Toeplitz matrices works for general frequencies, wherein the frequency atom in (II.6) can freely take any frequency and phase values, and it is not clear how to further tighten the positive semidefinite Toeplitz structure to incorporate the known prior information. Thus, it is non-trivial to formulate a computable convex program that can incorporate general prior information to improve signal recovery.

III Probabilistic Priors

In the probabilistic prior model, the probability density function of the frequencies is known. Let $F$ be the random variable that describes the signal frequencies, and let the probability density function (pdf) of $F$ be $p_F(f)$. The problem of line spectrum estimation deals with a finite number of signal frequencies in the domain $[0, 1]$. For example, we can assume $p_F(f)$ to be piecewise constant as follows. Let the domain $[0, 1]$ consist of $p$ disjoint subbands such that $[0, 1] = \bigcup_{k=1}^{p} \mathcal{B}_k$, where $\mathcal{B}_k$ denotes a subband or a subset of $[0, 1]$. Then the restriction of $p_F(f)$ to $\mathcal{B}_k$ is a constant. Figure III.1 illustrates a simple case for $p = 2$, where the line spectrum of a signal is non-uniformly sparse over two frequency subbands $\mathcal{B}_1$ and $\mathcal{B}_2$, such that the frequencies $f_j$, $j = 1, \dots, s$, are more likely to occur in one subinterval than in the other.

Figure III.1: The probability density function $p_F(f)$ of the frequencies, shown with the location of the true frequencies in the spectrum of the signal $x$.

Intuitively, given probabilistic priors, one may think of recovering the signal $x$ by minimizing a weighted atomic norm given by:

$$\|\hat{x}\|_{w\mathcal{A}} = \inf_{c_j, f_j} \left\{ \sum_{j=1}^{s} w_j |c_j| : \hat{x}[l] = \sum_{j=1}^{s} c_j e^{i 2\pi f_j l},\ l \in \mathcal{N} \right\}, \tag{III.1}$$

where $w = \{w_1, \dots, w_s\}$ is the weight vector, each element $w_j$ of which is associated with the probability of occurrence of the corresponding signal frequency $f_j$. The weights are assigned using a weight function $w(f)$, a piecewise constant function in the domain $[0, 1]$ such that the restriction of $w(f)$ to $\mathcal{B}_k$ is a constant. Therefore, for all $f_j \in \mathcal{B}_k$, we have $w(f_j) = w_k$ (say). The weight $w_k$ decreases with the pdf value of the corresponding frequency subband, so that a subband with a higher (lower) pdf value, i.e., one in which the frequencies are less (more) sparse, is weighted lightly (heavily).
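A piecewise-constant weight function of this kind can be sketched as a simple lookup over subband edges. The text does not fix a specific rule for converting pdf values into weights; the reciprocal rule used below is purely an assumption for illustration, as are the function name, the band edges, and the pdf values.

```python
import bisect

def make_weight_function(edges, pdf_values):
    """Return w(f), constant on each subband [edges[k], edges[k+1]).

    ASSUMPTION (not from the paper): the weight is the reciprocal of the
    (strictly positive) pdf value, so likelier subbands get lighter weights.
    """
    weights = [1.0 / p for p in pdf_values]

    def w(f):
        # Index of the subband containing f; f = 1.0 is assigned to the last subband.
        k = min(bisect.bisect_right(edges, f) - 1, len(weights) - 1)
        return weights[k]

    return w

edges = [0.0, 0.3, 1.0]          # two subbands: [0, 0.3) and [0.3, 1]
pdf_values = [2.5, 0.25 / 0.7]   # piecewise-constant pdf that integrates to one
w = make_weight_function(edges, pdf_values)
light, heavy = w(0.1), w(0.5)    # weight in the likelier band, then in the sparser band
```

Any other rule that maps higher pdf values to smaller weights would fit the description equally well; the lookup structure is the part that carries over.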

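The problem of line spectral estimation using probabilistic priors can now be presented as the (primal) optimization problem concerning the weighted atomic norm:

$$\underset{\hat{x}}{\text{minimize}}\ \|\hat{x}\|_{w\mathcal{A}} \quad \text{subject to}\ \hat{x}[l] = x[l],\ l \in \mathcal{M}. \tag{III.2}$$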

But we now observe that, unlike the weighted $\ell_1$ norm [28], a semidefinite characterization of the weighted atomic norm does not evidently result from (II.4). Instead, we propose a new semidefinite program for the weighted atomic norm using theories of positive trigonometric polynomials, by looking at its dual problem. For the standard atomic norm minimization problem (II.3), the dual problem is framed in this manner:

$$\underset{q}{\text{maximize}}\ \langle q_{\mathcal{M}}, x_{\mathcal{M}} \rangle_{\mathbb{R}} \quad \text{subject to}\ \|q\|_{\mathcal{A}}^{*} \le 1,\ \ q_{\mathcal{N} \setminus \mathcal{M}} = 0, \tag{III.3}$$

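where $\|\cdot\|_{\mathcal{A}}^{*}$ represents the dual norm. This dual norm is defined as

$$\|q\|_{\mathcal{A}}^{*} = \sup_{\|\hat{x}\|_{\mathcal{A}} \le 1} \langle q, \hat{x} \rangle_{\mathbb{R}} = \sup_{f \in [0, 1]} |\langle q, a(f, 0) \rangle|. \tag{III.4}$$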
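For intuition, the dual norm can be approximated numerically. The sketch below assumes that the dual atomic norm equals the supremum over $f \in [0, 1]$ of $|\langle q, a(f, 0) \rangle|$ with $\langle q, a(f, 0) \rangle = \sum_l q_l e^{-i 2\pi f l}$; this sign convention, the function names, and the hand-built test vector are assumptions for illustration. Sampling a grid only yields a lower bound on the true supremum.

```python
import cmath
import math

def dual_polynomial(q, f):
    """Q(f) = sum_l q[l] * exp(-i 2 pi f l) (assumed sign convention)."""
    return sum(q_l * cmath.exp(-2j * math.pi * f * l) for l, q_l in enumerate(q))

def dual_norm_on_grid(q, grid_size=4096):
    """Lower-bound sup over f in [0, 1) of |Q(f)| by sampling a uniform grid."""
    return max(abs(dual_polynomial(q, k / grid_size)) for k in range(grid_size))

n = 16
f0 = 0.25
# Hand-built vector aligned with a single atom at f0, scaled so that |Q(f0)| = 1.
q = [cmath.exp(2j * math.pi * f0 * l) / n for l in range(n)]
approx = dual_norm_on_grid(q)
```

Here the maximum is attained exactly at `f0`, which lies on the sampling grid, so the grid estimate coincides with the supremum for this particular vector.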

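For the weighted atomic norm minimization, the primal problem (III.2) has only equality constraints. As a result, Slater's condition is satisfied and, therefore, strong duality holds [42]. In other words, solving the dual problem also yields an exact solution to the primal problem. The dual of the weighted atomic norm is given by

$$\|q\|_{w\mathcal{A}}^{*} = \sup_{\|\hat{x}\|_{w\mathcal{A}} \le 1} \langle q, \hat{x} \rangle_{\mathbb{R}} = \sup_{f \in [0, 1]} \left| \left\langle q, \frac{1}{w(f)} a(f, 0) \right\rangle \right|. \tag{III.5}$$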

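The dual problem to (III.2) can then be stated as

$$\underset{q}{\text{maximize}}\ \langle q_{\mathcal{M}}, x_{\mathcal{M}} \rangle_{\mathbb{R}} \quad \text{subject to}\ \|q\|_{w\mathcal{A}}^{*} \le 1,\ \ q_{\mathcal{N} \setminus \mathcal{M}} = 0, \tag{III.6}$$

which, by substitution of (III.5), becomes

$$\underset{q}{\text{maximize}}\ \langle q_{\mathcal{M}}, x_{\mathcal{M}} \rangle_{\mathbb{R}} \quad \text{subject to}\ \sup_{f \in [0, 1]} \left| \left\langle q, \frac{1}{w(f)} a(f, 0) \right\rangle \right| \le 1,\ \ q_{\mathcal{N} \setminus \mathcal{M}} = 0. \tag{III.7}$$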

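Let the probabilistic priors consist of distinct weights $w_k$ for $p$ different frequency subbands $\mathcal{B}_k \subset [0, 1]$, such that $\mathcal{B}_k = [f_{L_k}, f_{H_k}]$, where $f_{L_k}$ and $f_{H_k}$ are, respectively, the lower and upper cut-off frequencies for each band $\mathcal{B}_k$ (Figure III.2). If the probability density function is constant within a frequency band, then the supremum in (III.7) over that band does not depend on the weight function, and therefore, the inequality constraint in the dual problem (III.7) can be expanded as,

$$\underset{q}{\text{maximize}}\ \langle q_{\mathcal{M}}, x_{\mathcal{M}} \rangle_{\mathbb{R}} \quad \text{subject to}\ \sup_{f \in \mathcal{B}_k} |\langle q, a(f, 0) \rangle| \le w_k,\ k = 1, \dots, p,\ \ q_{\mathcal{N} \setminus \mathcal{M}} = 0. \tag{III.8}$$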

We now map each of the inequality constraints in the foregoing dual problem to a linear matrix inequality, leading to the semidefinite characterization of the weighted atomic norm minimization. We recognize that the constraints in (III.8) imply that $Q(f)$ is a positive trigonometric polynomial [32] in $f$, since

$$Q(f) = \langle q, a(f, 0) \rangle = \sum_{l=0}^{n-1} q_l e^{-i 2\pi f l}. \tag{III.9}$$

Such a polynomial can be parameterized by a particular type of positive semidefinite matrix. Thus, we can transform the polynomial inequality, such as the ones in (III), to a linear matrix inequality.
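Before any conversion to linear matrix inequalities, the band-wise constraints can be sanity-checked by brute force. The sketch below assumes constraints of the form "supremum of $|Q(f)|$ over subband $k$ is at most $w_k$", with $Q(f) = \sum_l q_l e^{-i 2\pi f l}$; the function names, the grid density, the tolerance, and the hand-built vector `q` are illustrative assumptions. A grid check can miss violations between grid points, which is one reason an exact LMI characterization is preferable.

```python
import cmath
import math

def dual_polynomial(q, f):
    """Q(f) = sum_l q[l] * exp(-i 2 pi f l) (assumed sign convention)."""
    return sum(q_l * cmath.exp(-2j * math.pi * f * l) for l, q_l in enumerate(q))

def band_supremum(q, f_low, f_high, points=2001):
    """Approximate sup of |Q(f)| over [f_low, f_high] on a uniform grid."""
    step = (f_high - f_low) / (points - 1)
    return max(abs(dual_polynomial(q, f_low + k * step)) for k in range(points))

def satisfies_band_constraints(q, bands, weights, tol=1e-9):
    """Check sup_{f in band k} |Q(f)| <= w_k (up to tol) for every subband."""
    return all(band_supremum(q, lo, hi) <= w + tol
               for (lo, hi), w in zip(bands, weights))

n = 16
q = [cmath.exp(2j * math.pi * 0.25 * l) / n for l in range(n)]  # |Q| peaks at f = 0.25
bands = [(0.0, 0.5), (0.5, 1.0)]
feasible = satisfies_band_constraints(q, bands, [1.0, 1.0])
infeasible = satisfies_band_constraints(q, bands, [0.5, 1.0])   # first band too tight
```

Tightening the weight of the band that contains the peak makes the same vector infeasible, which mirrors how band weights shape the admissible dual polynomials.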

Figure III.2: The individual frequencies of the spectrally parsimonious signal are assumed to lie in known frequency subbands within the normalized frequency domain $[0, 1]$ (horizontal axis: normalized frequency $f$, from 0 to 1). We assume that all subbands are non-overlapping.

III-A Gram Matrix Parametrization

A trigonometric polynomial $R(z) = \sum_{k=-(n-1)}^{n-1} r_k z^{-k}$, which is also nonnegative on the entire unit circle, can be parametrized using a positive semidefinite, Hermitian matrix $G$ (called the Gram matrix) that identifies the polynomial coefficients $r_k$ as a function of its elements [43, p. 23]:

$$r_k = \mathrm{Tr}[\Theta_k G], \tag{III.10}$$

where $\Theta_k$ is an elementary Toeplitz matrix with ones on its $k$th diagonal and zeros elsewhere. Here, $k = 0$ corresponds to the main diagonal, and $k$ takes positive and negative values for upper and lower diagonals respectively.
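The trace relation can be checked numerically on a small example. The sketch assumes the coefficients are obtained as $r_k = \mathrm{Tr}[\Theta_k G]$, with $\Theta_k$ carrying ones on its $k$th diagonal (upper for $k > 0$), and that the polynomial is evaluated on the unit circle as $R(\omega) = \sum_k r_k e^{-ik\omega}$; the sign convention in the exponent, the function names, and the random test matrix are assumptions made for illustration. A positive semidefinite $G$ is built as a sum of rank-one terms $h h^{*}$, so $R(\omega)$ is a sum of squared magnitudes.

```python
import cmath
import math
import random

def gram_coefficients(G):
    """r_k = Tr(Theta_k G), with Theta_k having ones on its k-th diagonal."""
    n = len(G)
    # Tr(Theta_k G) sums the entries G[row][col] with row - col = k.
    return {k: sum(G[a][a - k] for a in range(n) if 0 <= a - k < n)
            for k in range(-(n - 1), n)}

def evaluate(r, omega):
    """R(omega) = sum_k r_k * exp(-i k omega) (assumed sign convention)."""
    return sum(r_k * cmath.exp(-1j * k * omega) for k, r_k in r.items())

rng = random.Random(0)
n = 5
G = [[0j] * n for _ in range(n)]
for _ in range(3):  # G = sum of three rank-one PSD terms h h^*
    h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    for a in range(n):
        for b in range(n):
            G[a][b] += h[a] * h[b].conjugate()

r = gram_coefficients(G)
values = [evaluate(r, 2 * math.pi * t / 200) for t in range(200)]
```

The sampled values are real and nonnegative up to rounding, and $r_0$ equals the trace of $G$, consistent with a Gram matrix describing a polynomial that is nonnegative on the whole unit circle.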

For the trigonometric polynomial that is nonnegative only over an arc of the unit circle, we have the following theorem:

Theorem III.1.

[43, p. 12] A trigonometric polynomial

(III.11)

where for which , for any , , can be expressed as

(III.12)

where , and are causal polynomials with complex coefficients, of degree at most and , respectively. The polynomial

(III.13)

where

(III.14)
(III.15)
(III.16)
(III.17)

is defined such that is nonnegative for and negative on its complement. (Footnote 1: cf. the errata to [43] available online. The 2007 print edition of [43] has an error in the expression (III.15).)

Since and are causal polynomials, the products and are positive trigonometric polynomials that can each be separately parameterized with Gram matrices and respectively.

Proposition III.2.

A trigonometric polynomial in (III.11) that is nonnegative on the arc or, alternatively, the subband , can be parameterized using the Gram matrices and as follows:

(III.18)

where we additionally require the elementary Toeplitz matrix in the second argument to be a nilpotent matrix of order for . The translation of frequencies between the two domains is given by:

(III.19)
(III.20)
Proof:

Let and be causal polynomials such that, , and , where , and are vectors of coefficients of the causal polynomials and respectively, and , and , are the canonical basis vectors of the corresponding polynomials. Let

From the above, . Let and be the Gram matrices. Then, as shown in (III.10), the parameterization process yields, . Also, by definition, if the Gram matrix is associated with a trigonometric polynomial , then we have

(III.21)

where

This leads to the following expressions:

(III.22)
(III.23)
(III.24)

Substitution of (III.22)-(III.24) in (III.21) gives the following matrix-parametric expression,

Then,

(III.25)

Substitution of matrix parameterizations of and in the expression of completes the proof. ∎

The dual polynomial in (III.9) is nonnegative on multiple non-overlapping intervals, and can therefore be parameterized by as many different pairs of Gram matrices , as the number of subbands . In the following subsection, we relate this parametrization to the corresponding probabilistic weights of the subbands.

III-B SDP Formulation

Based on the Bounded Real Lemma [43, p. 127] (which, in turn, is based on Theorem III.1), a positive trigonometric polynomial constraint of the type can be expressed as a linear matrix inequality [43, p. 143]. Stating this result for the dual polynomial constraint over a single frequency band, such as those in (III.8), we have

(III.26)

if and only if there exist positive semidefinite Gram matrices and such that,

(III.27)

where is a halfspace, , and if . This linear matrix inequality representation using positive semidefinite matrices paves the way for casting the new dual problem in (III.8) as a semidefinite program. The above formulation shows that we have changed the inequality form in the convex optimization problem to an equality form, allowing semidefinite programming for the weighted atomic norm minimization.

If the cutoff-frequencies or (in domain) are equal to , then we can write such that . For the translated subband , let the corresponding subband in the domain be . Then, the LMI formulation given by (III.2) becomes valid for this subband. However, the polynomial is now evaluated in the domain instead of . The SDP for this frequency translation employs a scaled version of LMI in (III-B),

(III.28)

where

(III.29)

We now state the semidefinite program for weighted atomic norm minimization with the probabilistic priors. We use the LMI representation for each of the inequality constraints in (III) as follows: subject to (III.30) where and

The unknown frequencies in $\hat{x}$ can be identified by the frequency localization approach [6] based on computing the dual polynomial, which we state for the weighted atomic norm problem in Algorithm III.1. We note that this characterization of the spectral estimation is a general way to integrate given knowledge about the spectrum. If the engineer is able to locate the signal frequency in a particular subband with a very high degree of certainty, better results can be obtained using the optimization (III-B). Also, information about signal frequency bands is frequently available through previous research and measurements, especially in problems pertaining to communication, power systems and remote sensing. We consider this more practical case in the following section.

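Algorithm III.1 Frequency localization for probabilistic priors
1: Solve the dual problem (III.30) to obtain the optimum solution $q^{\star}$.
2: Let $\hat{f}_j$ be the unknown frequencies of the signal $x$. Identify the unknown frequencies $\hat{f}_j$ as the solutions of $|\langle q^{\star}, a(\hat{f}_j, 0) \rangle| = w_k$, where $\hat{f}_j \in \mathcal{B}_k$. For $f \in \mathcal{B}_k \setminus \{\hat{f}_j\}$, $|\langle q^{\star}, a(f, 0) \rangle| < w_k$.
3: The corresponding complex coefficients can be recovered by solving the system of simultaneous linear equations $x[l] - \sum_{j} c_j a(\hat{f}_j, 0)[l] = 0$, $l \in \mathcal{M}$.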
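Steps 2 and 3 of the algorithm can be sketched as follows, under several assumptions that are not confirmed by the text: the localization condition is taken to be $|Q(f)| = w(f)$ with $Q(f) = \sum_l q_l e^{-i 2\pi f l}$, the search is done on a uniform grid with an arbitrary tolerance, and the coefficients are obtained by least squares through the normal equations. Step 1 (the semidefinite program) is not solved here; the vector `q` is a hand-built stand-in with full support that merely peaks at the true frequency, so that the sketch runs without an SDP solver. All function names are hypothetical.

```python
import cmath
import math

def dual_polynomial(q, f):
    """Q(f) = sum_l q[l] * exp(-i 2 pi f l) (assumed sign convention)."""
    return sum(q_l * cmath.exp(-2j * math.pi * f * l) for l, q_l in enumerate(q))

def localize(q, weight, grid_size=2048, tol=1e-6):
    """Step 2 (grid version): frequencies where |Q(f)| reaches the weight w(f)."""
    return [k / grid_size for k in range(grid_size)
            if abs(abs(dual_polynomial(q, k / grid_size)) - weight(k / grid_size)) < tol]

def solve_coefficients(freqs, samples):
    """Step 3: least-squares fit of c_j in x[l] = sum_j c_j exp(i 2 pi f_j l)."""
    s = len(freqs)
    rows = [[cmath.exp(2j * math.pi * f * l) for f in freqs] for l in samples]
    y = list(samples.values())
    # Augmented normal equations [A^* A | A^* y], solved by Gauss-Jordan elimination.
    M = [[sum(row[i].conjugate() * row[j] for row in rows) for j in range(s)]
         + [sum(row[i].conjugate() * v for row, v in zip(rows, y))]
         for i in range(s)]
    for col in range(s):
        piv = max(range(col, s), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(s):
            if i != col:
                factor = M[i][col] / M[col][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[col])]
    return [M[i][s] / M[i][i] for i in range(s)]

n, f0 = 16, 0.25
c0 = 2.0 * cmath.exp(0.3j)
x = [c0 * cmath.exp(2j * math.pi * f0 * l) for l in range(n)]
samples = {l: x[l] for l in (0, 3, 4, 9, 13)}                  # observed subset
q = [cmath.exp(2j * math.pi * f0 * l) / n for l in range(n)]   # stand-in, NOT an SDP output
found = localize(q, weight=lambda f: 1.0)
coeffs = solve_coefficients(found, samples)
```

In practice the grid search would typically be replaced by root finding on the dual polynomial, since true frequencies need not lie on any grid; the grid is used here only to keep the example short.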

IV Block Priors

Of particular interest to spectral estimation are spectrally block sparse signals, where certain frequency bands are known to contain all the spectral contents of the signal. Let us assume that all the $s$ frequencies $f_j$ of the spectrally sparse signal $x$ are known a priori to lie only in a finite number of non-overlapping frequency bands or intervals within the normalized frequency domain $[0, 1]$. Here, the known set $\mathcal{B}$ is defined as the set of all frequency bands in which signal frequencies are known to reside. The prior information consists of the precise locations of all the frequency bands - the lower and upper cut-off frequencies $f_{L_k}$ and $f_{H_k}$ respectively for each band $\mathcal{B}_k$ - as shown in Figure IV.1. We, therefore, have $\mathcal{B} = \bigcup_{k=1}^{p} \mathcal{B}_k = \bigcup_{k=1}^{p} [f_{L_k}, f_{H_k}]$, where $p$ is the total number of disjoint bands known a priori.

Figure IV.1: The individual frequencies of the spectrally sparse signal are assumed to lie in known non-overlapping frequency subbands within the normalized frequency domain $[0, 1]$ (horizontal axis: normalized frequency $f$, from 0 to 1; vertical axis: amplitude).

This block prior problem could easily be considered as a special case of probabilistic priors where the probability of a frequency occurring in known subbands is unity while it is zero for all other subbands. When the frequencies are known to reside in the set of subbands a priori, we propose to minimize a constrained atomic norm for perfect recovery of the signal:

(IV.1)

As noted earlier, to recover all of the off-the-grid frequencies of the signal given the block priors, the direct extension of a semidefinite program from (II) to minimize the constrained atomic norm is non-trivial. We address this problem by working with the dual problem of the constrained atomic norm minimization, and then transforming the dual problem to an equivalent semidefinite program by using theories of positive trigonometric polynomials. We note that in the case of block priors, (III.4) can be written as , where is the dual polynomial. The primal problem of constrained atomic norm minimization is given by

(IV.2)

and, similar to (III), we can formulate the corresponding dual problem as

subject to (IV.3)

where . Since is defined as a union of multiple frequency bands, the inequality constraint in (IV.3) can be expanded to separate inequality constraints. It can be easily observed that (IV.3) is a special case of (III.8) with all the weights being unity and (i.e., the set of bands need not necessarily cover the entire frequency range). While framing the semidefinite program for this problem, we use a linear matrix inequality similar to that in (III-B) with for each of the inequality constraints in (IV.3), to cast the dual problem constraint into a semidefinite program. So, when all the frequencies are known to lie in disjoint frequency bands, then the semidefinite program for the dual problem in (IV.3) can be constructed by using equality-form constraints: subject to (IV.4) where and

In the extreme case when any of the known frequency bands have or lying exactly on either or , then the dual-polynomial in IV should be appropriately translated as noted in (III.29).

In many applications, the location of some of the signal frequencies might be precisely known. One could think of this known poles problem as a probabilistic prior problem where the cardinality of some sets is exactly unity (and the associated probability be unity as well), while the remaining frequency subbands have a non-unity probability. However, there are a few differences. For probabilistic priors, the probability distribution function is known for the entire interval while, in case of known poles, the probability distribution of the bands of unknown frequencies is unavailable. Also, unlike block prior formulation, known poles problem does not have zero probability associated with the remaining subbands.

V Known Poles

We now consider the case when some frequency components are known a priori but their corresponding amplitudes and phases are not. Let the index set of all the frequencies be , . Let be the index set of all the known frequencies, and . Namely, we assume that the signal contains some known frequencies , , . For known frequencies, let us denote their complex coefficients as and their phaseless frequency atoms as . We define the conditional atomic norm for the known poles as follows: