# Frequency estimation based on Hankel matrices and the alternating direction method of multipliers

Fredrik Andersson fa@maths.lth.se Centre for Mathematical Sciences, Lund University, Lund, Sweden Marcus Carlsson mc@maths.lth.se Centre for Mathematical Sciences, Lund University, Lund, Sweden Jean-Yves Tourneret jean-yves.tourneret@enseeiht.fr University of Toulouse, CNRS UMR 5505, INP-ENSEEIHT, Toulouse, France Herwig Wendt herwig.wendt@irit.fr University of Toulouse, CNRS UMR 5505, INP-ENSEEIHT, Toulouse, France
###### Abstract

We develop a parametric high-resolution method for the estimation of the frequency nodes of linear combinations of complex exponentials with exponential damping. We use Kronecker’s theorem to formulate the associated nonlinear least squares problem as an optimization problem in the space of vectors generating Hankel matrices of fixed rank. Approximate solutions to this problem are obtained by using the alternating direction method of multipliers. Finally, we extract the frequency estimates from the con-eigenvectors of the solution Hankel matrix. The resulting algorithm is simple, easy to implement and can be applied to equally spaced samples with approximation weights, which for instance allows for missing data samples. By means of numerical simulations, we analyze and illustrate the excellent performance of the method, attaining the Cramér-Rao bound.

Keywords: frequency estimation, nonlinear least squares, Hankel matrices, Kronecker’s theorem, missing data, alternating direction method of multipliers

## 1 Introduction

Spectral estimation constitutes a classical problem that has found applications in a large variety of fields (including astronomy, radar, communications, economics, medical imaging, spectroscopy, …, to name but a few). One important category of spectral estimation problems arises for signals that can be well represented by the parametric model

$$f_c(t) = \sum_{p=0}^{P-1} c_p e^{\zeta_p t} + e(t), \qquad c_p, \zeta_p \in \mathbb{C}, \tag{1}$$

where $e(t)$ is an additive noise term. Given a vector $f$ of (typically equally spaced) samples,

$$f(j) = f_c(t_0 + j\,t_s), \qquad 0 \le j \le 2N,$$

where $t_s$ is the sampling period, the goal is to estimate the complex frequency nodes $\zeta_p$, i.e., the damping parameters $\Re(\zeta_p)$ and the frequency parameters $\Im(\zeta_p)$. Note that once the nodes $\zeta_p$ have been computed, determining the amplitudes $c_p$ reduces to a simple linear regression problem. Thus, the focus lies on the estimation of the nodes $\zeta_p$. The signal model (1) includes the case of exponentially damped signals, defined by $\Re(\zeta_p) \neq 0$. This case has recently received significant interest, notably in applications involving nuclear quadrupole resonance signals, for which the nodes $\zeta_p$ (together with the amplitudes $c_p$) are used as a signature of the chemical composition of an analysed substance.
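To fix ideas, the sampled model above can be simulated in a few lines. The following NumPy sketch generates samples of (1); all parameter values here are purely illustrative and are not those used later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for illustration only.
P = 2
c = np.array([1.0 + 0.5j, 0.8 - 0.2j])          # complex amplitudes c_p
zeta = np.array([-0.01 + 0.6j, -0.02 + 1.1j])   # complex nodes zeta_p (damping + i*frequency)
N = 50
j = np.arange(2 * N + 1)

# Noise-free samples f(j) = sum_p c_p * exp(zeta_p * j)
f0 = (c[None, :] * np.exp(np.outer(j, zeta))).sum(axis=1)

# Add circular white Gaussian noise e(t)
sigma = 0.01
f = f0 + sigma * (rng.standard_normal(j.size)
                  + 1j * rng.standard_normal(j.size)) / np.sqrt(2)
```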

The literature on methods for estimating the nodes in this setting is rich. Here, we focus on so-called high-resolution methods, i.e., methods whose frequency resolution is not tied to a pre-defined grid (and can attain numerical precision in the noise-free case).
One important class of techniques is based on the principle of nonlinear least squares (NLS)  and aims at solving the nonlinear regression problem associated with (1), i.e., at minimizing

$$\theta(c, \zeta) = \sum_{j=0}^{2N} \Big| f(j) - \sum_{p=0}^{P-1} c_p e^{\zeta_p j} \Big|^2$$

with respect to the parameter vectors $c = (c_0, \dots, c_{P-1})$ and $\zeta = (\zeta_0, \dots, \zeta_{P-1})$. Although NLS estimators enjoy desirable theoretical properties (e.g., those of maximum likelihood estimators when $e$ is Gaussian white noise, and robustness to coloured noise), their practical use remains limited by the extreme difficulty of globally minimizing $\theta$, whose pronounced multimodal shape features one very sharp global minimum.
A second prominent and popular class is given by subspace methods (such as MUSIC [11], ESPRIT [10] and min-norm [9]). These methods are based on estimates of the covariance of $f$ and rely on the assumption that the noise is white.

In the present contribution, we develop a novel high-resolution methodology for the estimation of the nodes $\zeta_p$ in (1). Similar to NLS, we aim at approximating $f$ as well as possible by a linear combination of complex exponentials. However, the proposed methodology fundamentally departs from the above methods in the following ways:

First, it is based on Kronecker’s theorem for complex symmetric matrices, which essentially states that if a function is uniformly sampled, then the Hankel matrix generated by the vector of samples has rank $P$ if and only if the function coincides at the sample points with a linear combination of $P$ exponential functions. This fact has been used for the sparse approximation of functions by sums of exponentials in the con-eigenvalue approach [3] and the alternating projections method [2]. Here, it is used to formulate an NLS-type minimization problem in which the model (and its parameters $c$ and $\zeta$) does not enter explicitly. Instead, the model is imposed implicitly by constraining the rank of the Hankel matrix generated by the vector approximating $f$ to equal $P$. Consequently, residual minimization is not performed over the parameter space directly, but over the space of vectors which generate Hankel matrices of rank $P$. The frequency nodes are then obtained from the con-eigenvectors of the solution Hankel matrix.

Second, we reformulate the minimization problem such that it can be effectively solved by the alternating direction method of multipliers (ADMM) [5]. ADMM is an iterative technique that has recently gained popularity due to its robustness, versatility and applicability to large-scale problems. Moreover, this technique often enjoys performance comparable to problem-specific state-of-the-art methods. While the optimization problem considered here is nonconvex, it is shown numerically that the solutions obtained with the ADMM procedure generically provide excellent approximations and parameter estimates for the model (1).

The resulting Hankel matrix ADMM frequency estimation procedure is practically extremely appealing due to its simplicity and ease of implementation. It enjoys excellent performance, outperforming subspace methods while at the same time alleviating the practical limitations of classical NLS. Unlike subspace methods, it does not rely on a specific noise model. Another interesting property is that it applies to situations with missing (or censored) data.

The remainder of this work is organized as follows. In Section 2, we define an optimization problem based on Kronecker’s theorem for the approximation by sums of complex exponentials and formulate the ADMM-based procedure for obtaining its solution. Section 3 summarizes the procedure for obtaining frequency node estimates from the solution Hankel matrix. In Section 4, we analyze the performance of the method by means of numerical simulations. Section 5 concludes this contribution and points to future work.

## 2 Approximation by sums of exponentials using ADMM

In this section, we approximate the vector $f$ as well as possible by a vector $g$ that is a linear combination of complex exponential functions. Using Kronecker’s theorem, we express the solution to this approximation problem in terms of the vector $g$ whose Hankel matrix has rank $P$ and which minimizes the residual. The resulting optimization problem will then be reformulated and solved by ADMM.

### 2.1 Hankel matrices and Kronecker’s theorem

For completeness, we recall that a Hankel matrix has constant values on the anti-diagonals, i.e.,

$$A(j,k) = A(j',k') \quad \text{if } j + k = j' + k'.$$

It can thus be generated elementwise from a vector $g$, such that

$$A(j,k) = g(j+k), \qquad 0 \le j, k \le N.$$

We denote this by $A = H_g$.

Kronecker’s theorem [1, 3] states that if the Hankel matrix $H_g$ generated by the vector $g$ is of rank $P$, then, with the exception of degenerate cases, there exist $c_p$ and $\zeta_p$ such that

$$g(j) = \sum_{p=0}^{P-1} c_p e^{\zeta_p j}, \qquad c_p, \zeta_p \in \mathbb{C}. \tag{2}$$

The converse holds as well.
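The converse direction is easy to verify numerically. The following NumPy sketch builds a Hankel matrix from a noise-free sum of $P=3$ exponentials (with hypothetical parameter values chosen only for this demonstration) and checks that its numerical rank equals $P$:

```python
import numpy as np

# Illustrative noise-free sum of P = 3 exponentials sampled at j = 0..2N
P, N = 3, 40
c = np.array([1.0, 0.5 + 0.5j, -0.3j])
zeta = np.array([-0.02 + 0.4j, -0.01 + 0.9j, 1.5j])
j = np.arange(2 * N + 1)
g = (c[None, :] * np.exp(np.outer(j, zeta))).sum(axis=1)

# Hankel matrix A(j,k) = g(j+k), 0 <= j,k <= N
idx = np.arange(N + 1)
A = g[idx[:, None] + idx[None, :]]

# Numerical rank: number of singular values above a small relative threshold
s = np.linalg.svd(A, compute_uv=False)
num = int((s > 1e-8 * s[0]).sum())
```

Here `num` equals the number of exponential terms, in line with Kronecker’s theorem.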

It follows that the best approximation (in the $\ell^2$ sense) of $f$ by a linear combination of $P$ complex exponentials is given by the vector $g$ which satisfies $\mathrm{rank}(H_g) = P$ and minimizes the norm of the residual $r = f - g$. In other words, $g$ can be obtained as the solution of the optimisation problem

$$\begin{aligned} \underset{g}{\text{minimize}} \quad & \frac{1}{2}\|r\|_2^2 = \frac{1}{2}\|f - g\|_2^2 \\ \text{subject to} \quad & \mathrm{rank}(H_g) = P. \end{aligned} \tag{3}$$

The parameter $P$ can be selected by considering the singular values of the Hankel matrix generated by the samples $f$. At this stage, we assume $P$ to be given.

### 2.2 A solution based on ADMM

Let $R_P$ be the indicator function for square matrices given by

$$R_P(S) = \begin{cases} 0 & \text{if } \mathrm{rank}(S) = P, \\ \infty & \text{otherwise.} \end{cases}$$

Then, problem (3) can be reformulated as follows

$$\begin{aligned} \underset{A, r}{\text{minimize}} \quad & R_P(A) + \frac{1}{2}\|r\|_2^2 \\ \text{subject to} \quad & A(j,k) + r(j+k) = f(j+k), \quad 0 \le j, k \le N. \end{aligned} \tag{4}$$

The optimization problem (4) has a cost function that consists of two terms, each depending only on one of the variables $A$ and $r$, along with a linear constraint. The problem formulation is therefore well suited to be addressed using ADMM. ADMM is an iterative technique in which a solution to a large global problem is found by coordinating solutions to smaller subproblems. It can be seen as an attempt to merge the benefits of dual decomposition and augmented Lagrangian methods for constrained optimization. For an overview of the ADMM method, see [5]. For a convex cost function, ADMM is guaranteed to converge to the unique solution. For non-convex problems it can get stuck at local minima, although it can work quite well in practice also in these situations, cf. [5, Chapter 9]. Since the rank constraint is nonconvex, convergence of an algorithm solving (4) is not guaranteed. Our numerical simulations indicate that the method works substantially better than established high-resolution techniques like ESPRIT, and moreover that it can be applied to cases where ESPRIT is not applicable (e.g., missing data).¹

¹ A convex problem could be obtained by replacing the rank constraint by the nuclear norm, at the cost of biased solutions.

An ADMM iterative step (from iteration $q$ to iteration $q+1$) for (4) reads

$$A^{q+1} = \arg\min_{A}\, L(A, r^q, \Lambda^q), \tag{5}$$

$$r^{q+1} = \arg\min_{r}\, L(A^{q+1}, r, \Lambda^q), \tag{6}$$

$$\Lambda^{q+1}(j,k) = \Lambda^q(j,k) + \rho\big(A^{q+1}(j,k) + r^{q+1}(j+k) - f(j+k)\big), \tag{7}$$

where $\Lambda$ is the matrix of Lagrange multipliers, $\rho > 0$ is the penalty parameter, and $L$ is the augmented Lagrangian associated with (4),

$$L(A, r, \Lambda) = R_P(A) + \frac{1}{2}\|r\|_2^2 + \sum_{j,k=0}^{N} \mathrm{Re}\Big\{\overline{\Lambda(j,k)}\,\big(A(j,k) + r(j+k) - f(j+k)\big)\Big\} + \frac{\rho}{2}\sum_{j,k=0}^{N} \big|A(j,k) + r(j+k) - f(j+k)\big|^2.$$

The first minimization step (5) is nonconvex due to the projection onto a nonconvex set, but it can be solved analytically. Deriving closed-form expressions for both minimization steps (5) and (6) is straightforward. The solution to the first minimization step is equal to the best rank-$P$ approximation of the matrix $B$ defined elementwise by

$$B(j,k) = f(j+k) - r^q(j+k) - \rho^{-1}\Lambda^q(j,k).$$

Denoting by $B = U \Sigma V^*$ the singular value decomposition of $B$, then by the Eckart-Young theorem,

$$A^{q+1} = U \Sigma_P V^*, \tag{8}$$

where $\Sigma_P$ is obtained from the diagonal matrix $\Sigma$ by replacing the diagonal elements $\sigma_p$ for $p \ge P$ with zeros.
The solution to the second minimization step is given by

$$r^{q+1}(l) = \frac{\rho\, Q(l) f(l) - \sum_{j+k=l}\big[\Lambda^q(j,k) + \rho A^{q+1}(j,k)\big]}{1 + \rho\, Q(l)}, \tag{9}$$

where $\sum_{j+k=l}$ denotes summation over all index pairs $(j,k)$ with $0 \le j,k \le N$ and $j+k=l$, and

$$Q(l) = \begin{cases} l+1 & \text{if } l \le N, \\ 2N+1-l & \text{otherwise.} \end{cases}$$
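The rank-$P$ projection step (8) can be checked in isolation. The following NumPy sketch (with an arbitrary random test matrix) truncates the singular values and verifies the Eckart-Young property that the residual norm equals the norm of the discarded singular values:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

P = 2
U, s, Vh = np.linalg.svd(B)
s_trunc = s.copy()
s_trunc[P:] = 0.0                 # zero out sigma_p for p >= P
A = (U * s_trunc) @ Vh            # best rank-P approximation (Eckart-Young)
```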

With these explicit expressions for the ADMM minimization steps, the procedure for the approximation by sums of exponentials is given by the following simple MATLAB function:

```matlab
function f_appro=exp_appro(f,P,rho,it_max)
%Best approximation to f using P exponentials.
N=(length(f)-1)/2; r=zeros(2*N+1,1);
Lambda=zeros(N+1); Q=[1:N+1,N:-1:1]';
for iter=1:it_max,
  B=form_hank(f-r)-Lambda/rho; [u,s,v]=svd(B);
  s=diag(s); s(P+1:end)=0; A=u*diag(s)*v';
  r=(rho*Q.*f-sum_hank(Lambda+rho*A))./(1+rho*Q);
  Lambda=Lambda+rho*(A-form_hank(f-r));
end;
f_appro=sum_hank(A)./Q;

function H=form_hank(f)     %form Hankel matrix
H=hankel(f(1:(end+1)/2),f((end+1)/2:end));

function f=sum_hank(H)      %sum anti-diagonals
N=size(H,1)-1; f=zeros(2*N+1,1); H=flipud(H);
for j=-N:N, f(j+N+1,1)=sum(diag(H,j)); end;
```
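For readers working outside MATLAB, the same procedure can be transcribed to NumPy. The sketch below mirrors the MATLAB routine (the function names `exp_appro`, `form_hank` and `sum_hank` are ours), and the demo at the end checks that a noise-free sum of two exponentials is reproduced:

```python
import numpy as np

def form_hank(f):
    """Hankel matrix A(j,k) = f(j+k) from a vector of length 2N+1."""
    n = (len(f) - 1) // 2
    idx = np.arange(n + 1)
    return f[idx[:, None] + idx[None, :]]

def sum_hank(H):
    """Sum the anti-diagonals of H (adjoint of form_hank)."""
    n = H.shape[0] - 1
    out = np.zeros(2 * n + 1, dtype=H.dtype)
    Hf = np.flipud(H)
    for j in range(-n, n + 1):
        out[j + n] = np.trace(Hf, offset=j)
    return out

def exp_appro(f, P, rho=1.0, it_max=100):
    """ADMM sketch for problem (4): approximate f by P exponentials."""
    N = (len(f) - 1) // 2
    jj = np.arange(1, N + 2)
    Q = np.concatenate([jj, jj[:-1][::-1]]).astype(float)  # anti-diagonal lengths Q(l)
    r = np.zeros(2 * N + 1, dtype=complex)
    Lam = np.zeros((N + 1, N + 1), dtype=complex)
    for _ in range(it_max):
        B = form_hank(f - r) - Lam / rho
        U, s, Vh = np.linalg.svd(B)
        s[P:] = 0.0
        A = (U * s) @ Vh                                   # rank-P projection, step (5)
        r = (rho * Q * f - sum_hank(Lam + rho * A)) / (1 + rho * Q)  # step (6)/(9)
        Lam = Lam + rho * (A - form_hank(f - r))           # multiplier update (7)
    return sum_hank(A) / Q

# Demo with illustrative parameters: noise-free data is a fixed point
N, P = 30, 2
j = np.arange(2 * N + 1)
zeta = np.array([-0.02 + 0.5j, -0.01 + 1.3j])   # hypothetical nodes
c = np.array([1.0, 0.7j])                       # hypothetical amplitudes
f0 = (c[None, :] * np.exp(np.outer(j, zeta))).sum(axis=1)
f_appro = exp_appro(f0, P, rho=1.0, it_max=20)
```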

### 2.3 Extension to the missing data case

It is straightforward to replace the norm $\|r\|_2^2$ in problem (4) by a weighted norm

$$\|r\|_{2,w}^2 = \sum_{j=0}^{2N} w(j)\,|r(j)|^2$$

and to derive the corresponding ADMM procedure. The only change in the final expressions above is that the denominator $1 + \rho\, Q(l)$ in the second minimization step (9) (and in the corresponding update of `r` in the MATLAB code) is replaced by $w(l) + \rho\, Q(l)$.

Let $J$ denote the set of indices of the missing data samples and set $f(j) = 0$ for $j \in J$. The use of weights defined by

$$w(j) = w_J(j) = \begin{cases} 0 & \text{if } j \in J, \\ 1 & \text{otherwise,} \end{cases}$$

yields an ADMM procedure for data with missing samples.
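The modification is local to the denominator of the $r$-update. The NumPy sketch below (with hypothetical sample count and missing indices) builds the weights $w_J$ and compares the weighted denominator $w(l) + \rho Q(l)$ with the unweighted one $1 + \rho Q(l)$:

```python
import numpy as np

# Hypothetical setup: 2N+1 samples, three of them missing.
N = 20
J = np.array([3, 7, 15])       # indices of missing samples (illustrative)
w = np.ones(2 * N + 1)
w[J] = 0.0                     # w_J(j) = 0 for j in J, 1 otherwise

rho = 1.0
jj = np.arange(1, N + 2)
Q = np.concatenate([jj, jj[:-1][::-1]]).astype(float)   # anti-diagonal lengths Q(l)

# Weighted r-update denominator replaces 1 + rho*Q(l) by w(l) + rho*Q(l);
# the numerator of (9) is unchanged.
den_plain = 1 + rho * Q
den_weighted = w + rho * Q
```

At the observed samples the two denominators coincide, while at the missing samples the data term drops out entirely.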

### 2.4 Convergence

In order to highlight the simplicity of the proposed algorithm, the ADMM procedure was described above as terminating after a preset number of iterations. Commonly used stopping criteria are based on the magnitudes of the primal and dual residuals and are straightforward to incorporate into the ADMM procedure (cf. [5] for details).

The ADMM procedure would converge to the global solution of problem (4) if the problem were convex. Since it is non-convex, the procedure is only guaranteed to converge locally and can converge to different points depending on the choice of $\rho$ and of the initial points $r^0$ and $\Lambda^0$. Numerical experiments indicate, however, that the ADMM solution of problem (4) in general provides an excellent approximation of $f$. Here, we have initialized $r^0$ and $\Lambda^0$ with zeros.

## 3 Frequency estimation

Once the solution Hankel matrix $A$ of (4) has been found, the parameter vector $\zeta$ can be obtained by considering the so-called con-eigenvectors of $A$. Hankel matrices belong to the class of complex symmetric matrices, satisfying $A = A^T$, which generically can be decomposed as [8]

$$A = \sum_{p=0}^{N} s_p u_p u_p^T, \qquad s_p \in \mathbb{R}^+,\ u_p \in \mathbb{C}^{N+1}, \tag{10}$$

where the $s_p$ are the decreasingly ordered con-eigenvalues of $A$ and where the con-eigenvectors $u_p$ are orthogonal and satisfy the relation $A \overline{u_p} = s_p u_p$. Similarly to the Eckart-Young theorem, it can be shown that the best rank-$P$ approximation of $A$ is given by

$$\sum_{p=0}^{P-1} s_p u_p u_p^T. \tag{11}$$

Since $g$ is a sum of $P$ exponentials, $A = H_g$ can be expressed in the form (11), and therefore each con-eigenvector $u_p$, $0 \le p < P$, will also be a sum of the same exponentials. Let $U = (u_0, \dots, u_{P-1})$. It is then possible to write $U = ZG$, where $Z$ is the Vandermonde matrix generated by the nodes $e^{\zeta_p}$ (i.e., $Z(j,p) = e^{\zeta_p j}$) and $G$ is some (invertible) matrix. Denote by $\underline{U}$ (resp. $\overline{U}$) the matrix $U$ whose first row (resp. last row) has been dropped. Clearly, we have $\underline{U} = \underline{Z}G$ and $\overline{U} = \overline{Z}G$, and the Vandermonde structure of $Z$ leads to $\underline{Z} = \overline{Z}\,\mathrm{diag}(e^{\zeta_0}, \dots, e^{\zeta_{P-1}})$. It follows that

$$(\overline{U}^* \overline{U})^{-1}(\overline{U}^* \underline{U}) = \overline{U}^\dagger \underline{U} = G^{-1}\,\mathrm{diag}(e^{\zeta_0}, \dots, e^{\zeta_{P-1}})\,G,$$

where $^*$ stands for the conjugate transpose and $^\dagger$ denotes the pseudo-inverse operator. Therefore, we can compute the nodes $e^{\zeta_p}$ as the eigenvalues of $\overline{U}^\dagger \underline{U}$.
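The extraction step can be sketched in NumPy. Since the dominant con-eigenvectors of $A$ span the same column space as the leading left singular vectors, the sketch below uses the SVD basis for simplicity; parameter values are again purely illustrative.

```python
import numpy as np

# Illustrative noise-free data; the nodes zeta are what we want to recover.
P, N = 2, 30
zeta = np.array([-0.02 + 0.5j, -0.01 + 1.2j])
c = np.array([1.0, 0.7j])
j = np.arange(2 * N + 1)
g = (c[None, :] * np.exp(np.outer(j, zeta))).sum(axis=1)
idx = np.arange(N + 1)
A = g[idx[:, None] + idx[None, :]]          # Hankel matrix H_g

# Basis of the P-dimensional column space (stands in for the con-eigenvectors)
U = np.linalg.svd(A)[0][:, :P]

# Drop the last row (U-bar) resp. the first row (U-underline); the
# eigenvalues of pinv(U-bar) @ U-underline are the nodes e^{zeta_p}.
M = np.linalg.pinv(U[:-1]) @ U[1:]
nodes = np.linalg.eigvals(M)
zeta_hat = np.sort_complex(np.log(nodes))
```

On noise-free data the recovered `zeta_hat` matches the true nodes up to numerical precision.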

## 4 Estimation performance

We analyze the estimation performance of the proposed frequency estimation algorithm by means of numerical simulations with sums of complex exponentials embedded in circular white Gaussian noise, using independent noise realisations for each of several signal-to-noise ratios (SNR). The model parameters $c_p$ and $\zeta_p$ are summarized in Tab. 1 (units correspond to uniform sampling of the interval). The performance is compared with the theoretical Cramér-Rao bounds (CRBs) for the estimation problem (see, e.g., [7, 12]).

### 4.1 Estimation performance for uniformly sampled data

In Figure 1, the standard deviations (STDs) of the estimates of the damping and frequency parameters, obtained with the ADMM procedure and with the ESPRIT method, are plotted as a function of the SNR, together with the (square root of the) CRBs. Note that two of the components are close in frequency but significantly differ in amplitude (cf. Tab. 1). We observe that the proposed ADMM procedure provides estimates attaining the corresponding CRBs, indicating that the procedure realizes the minimum variance estimator. It consistently outperforms the ESPRIT method in terms of STD, especially so for small SNR and for the two components that are close in frequency.

### 4.2 Estimation performance for missing data

In Fig. 2, we summarize the performance of the ADMM procedure for estimating the frequency nodes when samples are missing at random positions (left) or as one consecutive block centred at a random position (right). The STDs are again found to reach the (square root of the) CRBs, demonstrating that the ADMM procedure is not affected by the missing data. Similar results are obtained for the remaining parameters and are not reported here for space limitation reasons.

### 4.3 Approximation for missing data

Finally, Figure 3 illustrates the approximation performance of the ADMM procedure for samples of a sum of exponentials of which two blocks of 60 consecutive samples are missing (mean over 100 realizations). Despite the large number of missing samples, the obtained approximations are notably consistent with the underlying noise-free function, with maximum observed differences remaining a small fraction of the sampling frequency.

## 5 Conclusions

A high-resolution parametric frequency estimation procedure based on approximation with sums of complex exponentials was proposed. In the proposed algorithm, Kronecker’s theorem was used to cast the approximation problem in terms of generating vectors for Hankel matrices of rank $P$. The resulting optimization problem was addressed by an ADMM procedure. The Hankel matrices obtained from the ADMM procedure were then used to compute the parameters of the complex exponential model. This is in contrast to other methods, such as classical NLS or subspace methods. Although the optimization problem considered in this paper is non-convex (and thus inherits the problems of local minima, as does classical NLS), numerical simulations indicated that the ADMM procedure yields excellent practical performance.

The ADMM procedure can be applied to equally spaced samples, including situations with missing data, and does not rely on explicit noise model assumptions. Despite its versatility, the resulting algorithm is simple and easy to implement. The method has been presented here for complex-valued data. However, its real-valued counterpart can be obtained in a similar fashion as outlined in this work.

The procedure will be applied to frequency node estimation in nuclear quadrupole resonance applications. Future work includes the extension to multiple time series and to the estimation of directions of arrival.

## 6 Acknowledgment

The work was supported by the Swedish Foundation for International Cooperation in Research and Higher Education, the Swedish Research Council and the Crafoord Foundation. Part of this work was conducted within the Labex CIMI during visits of F. Andersson at the University of Toulouse.

## References

1. V. M. Adamjan, D. Z. Arov, and M. G. Kreĭn. Infinite Hankel matrices and generalized problems of Carathéodory-Fejér and F. Riesz. Funkcional. Anal. i Priložen., 2(1):1--19, 1968.
2. F. Andersson and M. Carlsson. Alternating projections on non-tangential manifolds. arXiv:1107.4055, 2011.
3. F. Andersson, M. Carlsson, and M. V. de Hoop. Sparse approximation of functions using sums of exponentials and AAK theory. Journal of Approximation Theory, 163(2):213--248, February 2011.
4. G. Beylkin and L. Monzon. On approximation of functions by exponential sums. Applied and Computational Harmonic Analysis, 19(1):17--48, July 2005.
5. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1--122, 2011.
6. Y. Bresler and A. Macovski. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Trans. Acoustics, Speech and Signal Process., 34(5):1081--1089, 1986.
7. E. Gudmundson, P. Wirfalt, A. Jakobsson, and M. Jansson. An ESPRIT-based parameter estimator for spectroscopic data. In IEEE Statistical Signal Processing Workshop (SSP’12), pages 77--80, 2012.
8. R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, Cambridge, 1994.
9. R. Kumaresan and D. W. Tufts. Estimating the angles of arrival of multiple plane waves. IEEE Trans. Aerospace and Electronic Systems, 19:134--139, 1983.
10. R. Roy and T. Kailath. ESPRIT - estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoustics, Speech and Signal Process., 37(7):984--995, 1989.
11. R. Schmidt. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas and Propagation, 34(3):276--280, 1986.
12. P. Stoica and R. Moses. Spectral Analysis of Signals. Prentice-Hall, 2005.