A multivariate generalization of Prony’s method
Abstract
Prony’s method is a prototypical method, based on eigenvalue analysis, for the reconstruction of a finitely supported complex measure on the unit circle from its moments up to a certain degree. In this note, we give a generalization of this method to the multivariate case and prove simple conditions under which the problem admits a unique solution. Provided the order of the moments is bounded from below by the number of points on which the measure is supported, as well as by a small constant divided by the separation distance of these points, stable reconstruction is guaranteed. In its simplest form, the reconstruction method consists of setting up a certain multilevel Toeplitz matrix of the moments, computing a basis of its kernel, and computing, by some method of choice, the set of common roots of the multivariate polynomials whose coefficients were obtained in the second step. All theoretical results are illustrated by numerical experiments.
Key words and phrases: frequency analysis, spectral analysis, exponential sum, moment problem, superresolution.
2010 AMS Mathematics Subject Classification: 65T40, 42C15, 30E05, 65F30
1 Introduction
In this paper we propose a generalization of de Prony’s classical method [10] for the parameter and coefficient reconstruction of univariate finitely supported complex measures to a finite number of variables. The method of de Prony lies at the core of seemingly different classes of problems in signal processing such as spectral estimation, search for an annihilating filter, deconvolution, spectral extrapolation, and moment problems. Thus we provide a new tool to analyze multivariate versions of a broad set of problems.
To recall the machinery of the classical Prony method let and let and pairwise distinct , , be given. Let denote the Dirac measure in on and let
be a finitely supported complex measure on . By the Prony method the and are reconstructed from moments
Since the coefficients , , of the (not a priori known) so-called Prony polynomial
fulfill the linear equations
and are in fact the unique solution to this system with , reconstruction of the is possible by computing the kernel vector of the rank Toeplitz matrix
and, knowing this to be the coefficient vector of the Prony polynomial, computing its roots. Afterwards, the coefficients of (which did not enter the discussion until now) can be uniquely recovered by solving a Vandermonde linear system of equations. When attempting to generalize this method to finitely supported complex measures on , it seems natural to think that the unknown parameters could be realized as roots of variate polynomials, and this is the approach we will follow here. As in the univariate case, the coefficients will be given as the solution to a suitably constructed system of linear equations. However, for , an added difficulty lies in the fact that a nonconstant polynomial always has uncountably many complex roots, so that a single polynomial cannot suffice to identify the parameters as its roots. A natural way to overcome this problem is to consider the common roots of a (finite) set of polynomials. These sets, commonly called algebraic varieties, are the subject of classical algebraic geometry, and thus there is an immense body of algebraic literature on this topic, from which we need only some basic notions, as provided at the end of Section 2.
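In one variable, the procedure just described can be sketched in a few lines. The following Python sketch assumes the standard setup f(k) = sum_j c_j z_j^k with moments f(0), ..., f(2M-1); the function and variable names are ours, chosen for illustration only.

```python
import numpy as np

def prony_univariate(moments, M):
    """Classical univariate Prony method: recover the nodes z_j and
    coefficients c_j of f(k) = sum_j c_j z_j**k from the moments
    f(0), ..., f(2M-1)."""
    f = np.asarray(moments, dtype=complex)
    # Prony's linear system: sum_{l=0}^{M} p_l f(k+l) = 0, k = 0,...,M-1
    H = np.array([[f[k + l] for l in range(M + 1)] for k in range(M)])
    # a kernel vector of H holds the coefficients of the Prony polynomial
    kernel = np.linalg.svd(H)[2][-1].conj()
    nodes = np.roots(kernel[::-1])          # roots of sum_l p_l z^l
    # recover the coefficients from the Vandermonde system V c = f
    V = np.vander(nodes, N=2 * M, increasing=True).T
    coeffs = np.linalg.lstsq(V, f, rcond=None)[0]
    return nodes, coeffs

# two nodes on the unit circle and their first four moments
z = np.exp(2j * np.pi * np.array([0.1, 0.35]))
c = np.array([1.0, 2.0])
moments = np.array([(c * z ** k).sum() for k in range(4)])
nodes, coeffs = prony_univariate(moments, 2)
```

Since the kernel of the Hankel-type system is one-dimensional in the exact-data case, any nonzero kernel vector (here obtained from the SVD) yields the Prony polynomial up to scaling, which does not affect its roots.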
Our main results are presented in Section 3, which is divided into three parts. In Section 3.1 we prove sufficient conditions to guarantee parameter reconstruction for multivariate exponential sums by constructing a set of multivariate polynomials such that their common roots are precisely the parameters. In Section 3.2 we focus on the case that the parameters lie on the dimensional torus, which allows us to prove numerical stability, provide some implications on the parameter distribution, and construct a single trigonometric polynomial localized at the parameters. Finally, we state a prototypical algorithm of the multivariate Prony method.
In Section 4 we discuss previous approaches towards the multivariate moment problem as can be found in [18, 1] for generic situations, in [27, 25, 17] based on projections of the measure, and in [7, 6, 4, 5] based on semidefinite optimization. Finally, numerical examples are presented in Section 5 and we conclude the paper with a summary in Section 6.
2 Preliminaries
Throughout the paper, the letter always denotes the dimension, , and we let
be the domain for our parameters. For , , we use the multi-index notation . We also let , and the fold Cartesian product is called the dimensional torus. We start by defining the object of our interest, that is, multivariate exponential sums, as a natural generalization of univariate exponential sums.
Definition 2.1.
A function is a variate exponential sum if there is a finitely supported complex measure on , such that for all , is the th moment of , that is, if there are , , and pairwise distinct such that with we have
for all .
In that case , , and , , are uniquely determined, and is called sparse, the are called coefficients of , and are called parameters of . The set of parameters of is denoted by or, if there is no danger of confusion, simply by .
Remark 2.2.
Let and pairwise distinct , , be given. Then the trigonometric moment sequence of ,
(where denotes the scalar product of and ) is a variate exponential sum with parameters . This case will be analyzed in detail in Section 3.2.
Let be an sparse variate exponential sum with coefficients and parameters , . Our objective is to reconstruct the coefficients and parameters of given an upper bound for and a finite set of samples of at a subset of that depends only on , see also [23].
The following notations will be used throughout the paper. For , let and let . The multilevel Toeplitz matrix
which we also refer to as , will play a crucial role in the multivariate Prony method. Note that the entries of are sampling values of at a grid of points.
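As an illustration of this construction, the following sketch assembles such a matrix from samples in two variables, assuming the moment convention f(x) = sum_j c_j e^{2 pi i t_j . x} of Remark 2.2; the parameter and coefficient values are hypothetical.

```python
import numpy as np
from itertools import product

def multilevel_toeplitz(f, n, d):
    """Assemble the matrix with entries f(k - l), where k and l run
    over the multi-indices {0, ..., n}^d; only samples of f on the
    grid {-n, ..., n}^d enter."""
    idx = [np.array(k) for k in product(range(n + 1), repeat=d)]
    return np.array([[f(k - l) for l in idx] for k in idx])

# a 2-sparse bivariate example with (hypothetical) torus parameters
t = np.array([[0.2, 0.7], [0.5, 0.1]])
c = np.array([1.0, 1.0])
f = lambda x: (c * np.exp(2j * np.pi * (t @ x))).sum()
T = multilevel_toeplitz(f, 2, 2)   # size (n+1)^d = 9
```

For real nonnegative coefficients and parameters on the torus, the resulting matrix is Hermitian positive semidefinite, as used in Section 3.2, and its rank equals the sparsity once the degree is large enough.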
Next we establish the crucial link between the matrix and the roots of multivariate polynomials. To this end, let
denote the algebra of variate polynomials and for let
The dimensional subvector space of variate polynomials of maxdegree at most is denoted by
and the evaluation homomorphism at will be denoted by
or simply by . Note that the representation matrix of w.r.t. the canonical basis of and the monomial basis of is given by the multivariate Vandermonde matrix
The connection between the matrix and polynomials that vanish on lies in the observation that, using Definition 2.1, the matrix admits the factorization
(2.1) 
with , , , and a permutation matrix . Therefore the kernel of , corresponding to the polynomials in that vanish on , is a subset of the kernel of .
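The correspondence between kernel vectors and vanishing polynomials can be verified numerically. The sketch below assumes the torus setting with positive coefficients, so that the kernel of the moment matrix coincides with that of the Vandermonde factor; all data are hypothetical.

```python
import numpy as np
from itertools import product

# Numerical check that kernel vectors of the moment matrix yield
# polynomials vanishing on the parameters. Assumes
# f(x) = sum_j c_j e^{2 pi i t_j . x} with positive c_j, so the
# matrix is positive semidefinite and its kernel matches that of
# the Vandermonde factor.
n, d = 2, 2
t = np.array([[0.2, 0.7], [0.5, 0.1]])           # hypothetical parameters
c = np.array([1.0, 1.5])
f = lambda x: (c * np.exp(2j * np.pi * (t @ x))).sum()
idx = [np.array(k) for k in product(range(n + 1), repeat=d)]
T = np.array([[f(k - l) for l in idx] for k in idx])

# rows of Vh beyond the numerical rank give coefficient vectors of
# polynomials p(z) = sum_k p_k z^k vanishing at z_j = e^{2 pi i t_j}
s, Vh = np.linalg.svd(T)[1:]
P = Vh[(s > 1e-10 * s[0]).sum():]
z = np.exp(2j * np.pi * t)                       # parameters on the torus
monomials = np.array([[(zj ** k).prod() for k in idx] for zj in z])
values = monomials @ P.T                         # p_l(z_j) for all j, l
```

Up to conjugation conventions, each row of the trailing part of the SVD factor is the coefficient vector of a polynomial that vanishes at all parameters, which is exactly the inclusion stated above.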
In order to deal with the multivariate polynomials encountered in this way we need some additional notation. The zero locus of a set of polynomials is denoted by
that is, consists of the common roots of all the polynomials in . For a set ,
is the so-called vanishing ideal of . Finally, for a set of polynomials,
is the ideal generated by . Note that always holds. Subsequently, we identify and and switch back and forth between the matrix-vector and the polynomial notation. In particular, we do not necessarily distinguish between and its representation matrix , so that e.g. “” makes sense.
3 Main results
In the following two subsections, we study conditions on the degree , and thereby on the number of samples, such that the parameters can be uniquely recovered and the polynomials used to identify them can be computed in a numerically stable way.
3.1 Complex parameters, polynomials, and unique solution
Our first result gives a simple but nonetheless sharp condition on the order of the moments such that the set of parameters and the zero loci and are equal.
Theorem 3.1.
Let be an sparse variate exponential sum with parameters , . If then
Moreover, if this equality holds for all sparse variate exponential sums , then .
Proof.
Let . We start by proving . Since , it is sufficient to prove the case . It is a simple fact that , and that these ideals are pairwise comaximal, and hence we have
which implies . Thus we have where the last equality holds because is finite (and can easily be derived from the above).
Thus it remains to show that . We proceed by proving . To simplify notation, we omit the subscript on the matrices. Let and suppose that has rank . Let and w.l.o.g. let , denoting the first rows of , be of rank . Now the first part of the proof implies the contradiction .
Considering the factorization as in Equation (2.1) and applying Frobenius’ rank inequality (see e.g. [15, 0.4.5 (e)]) yields
which implies . The factorization clearly implies which together with the ranknullity theorem yields the final result.
The converse follows from the fact that for with distinct , any subset such that (which holds, by assumption, for ) necessarily contains a polynomial of (max)degree at least . ∎
Example 3.2.
Let be a sparse variate exponential sum with parameters and . The generating system of given in the proof of Theorem 3.1 is
We start by illustrating the generic case that no two coordinates are equal, i.e., if and . The zero locus of each individual polynomial is illustrated in Figure 3.1, where each axis represents . The zero locus of each linear factor is a complex curve and illustrated by a single line. We note that the set is redundant, i.e., the last three polynomials in the first row of Figure 3.1 are sufficient to recover the points uniquely as their common roots, but there is no obvious general rule as to which polynomials can be omitted.
Four other point configurations are shown in Figure 3.2. In the first three configurations coordinates of different points agree, which allows us to remove some polynomials from . In particular, the third point set, which is collinear, is generated already by and . The fourth point set is generated either by the above set of polynomials or by and , (which are not elements of ).
Remark 3.3.
Concerning the “natural” generator used in the proof above, we note that although the ideals coincide, the subvector space inclusion
is strict in general as can be seen for , , , and the polynomial . Moreover, we have the cardinality , at least for different coordinates , and thus , i.e., the generator contains many linear dependencies and is highly redundant for large .
Finally, we would like to comment on the degree and the total number of samples with respect to the number of parameters :
- A small degree , , and surjective results in an uncountably infinite zero locus , since and thus is generated by fewer than polynomials.
- Increasing the degree results “generically” in a finite zero locus, cf. [1], but “generically” identifies spurious parameters, since e.g. for Bézout’s theorem yields with equality in the projective setting (counting the roots with multiplicity), for coprime polynomials .
Remark 3.4.
We discuss a slight modification of our approach. Instead of we take as index set and consider the matrix
instead of . Theorem 3.1 also holds with replaced by with almost no change to the proof. In this way we need only rather than samples of and also allow for arbitrary parameters instead of . While is a multilevel Toeplitz matrix, is a submatrix of a multilevel Hankel matrix, and for the trigonometric setting discussed in the following subsection, it is more natural to consider the moments , , , than , , .
3.2 Parameters on the torus, trigonometric polynomials, and stable solution
We now restrict our attention to parameters , hence for a unique . In this case, fulfills a fold symmetry in the following sense. Let and with , then is a root of the first-coordinate conjugate-reciprocal polynomial
Since the roots are self-reciprocal , we have and thus implies for all choices of a conjugate-reciprocal coordinate.
Theorem 3.5.
Let , , , , , and be given. Moreover, let , , be an orthonormal basis with , , and , , then ,
(3.1) 
is a trigonometric polynomial of degree and fulfills for all and if and only if for some .
Proof.
First note that every orthonormal basis , , leads to
for . Moreover, on yields that is indeed a trigonometric polynomial. Finally, Theorem 3.1 assures if and only if . ∎
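Theorem 3.5 can be illustrated numerically: the rows of the trailing part of an SVD factor are already orthonormal and hence furnish an orthonormal basis of the kernel, and the resulting sum of squared moduli vanishes at the parameters and is positive elsewhere. Parameters, coefficients, and degrees in the sketch below are hypothetical.

```python
import numpy as np
from itertools import product

# Localization polynomial in the spirit of Theorem 3.5:
# q(t) = sum_l |p_l(e^{2 pi i t})|^2 over an orthonormal basis p_l
# of the kernel polynomials vanishes precisely at the parameters.
n, d = 3, 2
t = np.array([[0.2, 0.7], [0.5, 0.1], [0.8, 0.4]])
c = np.array([1.0, 1.0, 1.0])
f = lambda x: (c * np.exp(2j * np.pi * (t @ x))).sum()
idx = [np.array(k) for k in product(range(n + 1), repeat=d)]
T = np.array([[f(k - l) for l in idx] for k in idx])
s, Vh = np.linalg.svd(T)[1:]
P = Vh[(s > 1e-8 * s[0]).sum():]       # orthonormal coefficient vectors

def q(point):
    """Evaluate the nonnegative localization polynomial at a point
    of the torus, identified with [0, 1)^2."""
    z = np.array([np.exp(2j * np.pi * point @ k) for k in idx])
    return float((np.abs(P @ z) ** 2).sum())

at_parameters = [q(tj) for tj in t]    # should be (numerically) zero
away = q(np.array([0.05, 0.9]))        # should be strictly positive
```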
We proceed with an estimate on the condition number of the preconditioned matrix .
Definition 3.6.
Let and , then
is the separation distance of . For , we say that is separated if .
Theorem 3.7.
Let , , , , , and be separated. Moreover, let , then implies the condition number estimate
where the diagonal preconditioner , , , is suitably chosen. In particular, .
Proof.
First note that the matrix is Hermitian positive semidefinite, and define the condition number by , where denotes the Moore–Penrose pseudoinverse. Let , , , and , then we have
and Corollary 4.7 in [20] yields the condition number estimate. The second claim follows since . ∎
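The following sketch is an empirical companion to Theorem 3.7: it computes the condition number of the Gram matrix of the Fourier-type Vandermonde matrix for a fixed, well-separated node set and two degrees. The concrete preconditioner and constants of the theorem are not reproduced; the node values are hypothetical.

```python
import numpy as np
from itertools import product

def gram_cond(t, n):
    """Condition number of the Gram matrix A^H A, where
    A[k, j] = e^{2 pi i k . t_j} for multi-indices k in {0,...,n}^d."""
    K = np.array(list(product(range(n + 1), repeat=t.shape[1])))
    A = np.exp(2j * np.pi * (K @ t.T))
    G = A.conj().T @ A                      # M x M Gram matrix
    s = np.linalg.svd(G, compute_uv=False)
    return s[0] / s[-1]

# well-separated (hypothetical) nodes on the two-dimensional torus
t = np.array([[0.1, 0.2], [0.6, 0.3], [0.35, 0.8]])
c_small = gram_cond(t, 2)    # small degree: moderate conditioning
c_large = gram_cond(t, 12)   # degree large w.r.t. 1/separation
```

In line with the theorem, increasing the degree relative to the inverse separation distance drives the condition number toward one, so computing the kernel is numerically stable.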
In summary, the condition
(3.2) 
allows for unique reconstruction of the parameters and stability is guaranteed when computing the kernel polynomials from the given moments.
Remark 3.8.
Up to the constant , the condition in the assumption of Theorem 3.7 is optimal in the sense that equidistant nodes , , , imply and . We expect that the constant can be improved and indeed, a discrete variant of Ingham’s inequality [19], [27, Lemma 2.1] replaces by but gives no explicit estimate on the condition number.
Moreover, Theorem 3.1 asserts that the condition on the degree with respect to the number of parameters is close to optimal in this specific setting. We briefly comment on the following typical scenarios for the point set and the relation (3.2):
- quasi-uniform parameters might be defined via , i.e., ,
- equidistant and collinear parameters, e.g. , imply , i.e., both terms are of similar size,
- and finally, parameters chosen at random from the uniform distribution imply , see e.g. [29], and thus .
Dropping the condition in (3.2) and restricting to the torus, we still get the following result on how much the roots of the polynomials in the kernel of can deviate from the original set .
Theorem 3.9.
Let , , , , , and be separated, then implies
Proof.
We prove the assertion by contradiction. Let be separated from the point set and let , , constitute a basis of . By definition, we have and thus the augmented Fourier matrix
fulfills , i.e., . On the other hand, Corollary 4.7 in [20] implies and thus the contradiction . ∎
3.3 Prototypical algorithm
Let be an sparse variate exponential sum with pairwise distinct parameters and be an upper bound. Theorem 3.1 justifies the following prototypical formulation of the multivariate Prony method.
Input: an upper bound for the sparsity and the corresponding samples of
1. Set up the multilevel Toeplitz matrix of these samples.
2. Compute a basis of the kernel of this matrix.
3. Compute the zero locus of the polynomials whose coefficient vectors span this kernel.
Output: the set of parameters of
The third step, i.e., the computation of the zero locus , is beyond the scope of this paper; several methods can be found elsewhere, see e.g. [2, 22, 33, 34]. We further note that the number of samples used scales as and that standard algorithms for computing the kernel of the matrix have cubic complexity.
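The steps above can be combined into a small end-to-end prototype. In the sketch below, the third step is replaced by a crude grid search over the localization polynomial of Theorem 3.5 rather than one of the cited polynomial-system solvers; the example data are hypothetical, with parameters placed on the search grid.

```python
import numpy as np
from itertools import product

def prony_2d(f, n, grid=64, tol=1e-8):
    """Prototype of the multivariate Prony method for two variables
    with parameters on the torus. Step 3 (common roots of the kernel
    polynomials) is replaced by a crude grid search; dedicated
    solvers would be used in practice."""
    idx = [np.array(k) for k in product(range(n + 1), repeat=2)]
    T = np.array([[f(k - l) for l in idx] for k in idx])    # step 1
    s, Vh = np.linalg.svd(T)[1:]
    M = int((s > tol * s[0]).sum())       # numerical rank = sparsity
    P = Vh[M:]                            # step 2: kernel basis
    # step 3: minima of q(t) = sum_l |p_l(e^{2 pi i t})|^2 on a grid
    g = np.arange(grid) / grid
    G = np.array([[a, b] for a in g for b in g])
    E = np.exp(2j * np.pi * (G @ np.array(idx).T))   # monomial values
    q = (np.abs(E @ P.T) ** 2).sum(axis=1)
    best = np.argsort(q)[:M]              # M smallest values
    return M, G[best]

# hypothetical 2-sparse example; parameters lie on the search grid
t_true = np.array([[0.25, 0.5], [0.75, 0.125]])
c = np.array([2.0, 1.0])
f = lambda x: (c * np.exp(2j * np.pi * (t_true @ x))).sum()
M, t_rec = prony_2d(f, 3)
```

Once the parameters are found, the coefficients follow from a Vandermonde least-squares problem exactly as in the univariate case.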
4 Other approaches
There are many variants of the one-dimensional moment problem from Section 1, originating from fields as diverse as signal processing, electrical engineering, and quantum chemistry, with applications as widespread as spectroscopy, radar imaging, and superresolved optical microscopy, see e.g. the survey paper [24]. Variants of Prony’s method with increased stability or a direct computation of the parameters without the detour via polynomial coefficients include for example MUSIC [32], ESPRIT [30], the Matrix Pencil method [16], the Approximate Prony method [26], the Annihilating Filter method [35], and methods relying on orthogonal polynomials [12].
Multivariate generalizations of these methods have been considered in [18, 1] by realizing the parameters as common roots of multivariate polynomials. In contrast to our approach, both of these papers put an emphasis on the generic situation where e.g. the zero locus of two bivariate polynomials is finite. In this case, the total number of moments used for reconstruction might indeed scale as the number of parameters, but no guarantee is given for a specific instance of the moment problem. A second line of multivariate generalizations [27, 25] decomposes the multivariate moment problem into a series of univariate moment problems via projections of the measure. While again this approach typically works well, the necessary number of a priori chosen projections for a signed measure scales as the number of parameters in the bivariate case [17]. We note that the subset
of the set of generators in the proof of Theorem 3.1 are exactly the univariate polynomials when projecting onto the coordinate axes, see also the first and last zero locus in Figure 3.1.
A different approach to the moment problem from Section 1 has been considered in [7, 6, 4, 5] and termed ‘superresolution’. From a signal processing perspective, knowing the first moments is equivalent to sampling a low-pass version of the measure and restoring the high-frequency information from these samples. With the notation of Remark 2.2, the measure with parameters is the unique minimizer of
provided the parameters fulfill a separation condition as in Section 3.2. This is proven via the existence of a so-called dual certificate [7, Appendix A] and becomes computationally attractive by recasting this dual problem as a semidefinite program. The program has just variables in the univariate case [7, Corollary 4.1], but, at least to our knowledge, no explicit bound on the number of variables is available in the multivariate case, see [11, Remark 4.17, Theorem 4.24, and Remark 4.26].
Finally, note that there is a large body of literature on the related topic of reconstructing multivariate sparse trigonometric polynomials from samples, see e.g. [3, 21, 13, 8, 28, 31, 14]. Translated to the situation at hand, all these methods rely heavily on the fact that the parameters are located on a Cartesian grid with mesh sizes for some and deteriorate if this condition fails [9]. Hence, these methods lack one major advantage of Prony’s method, namely that the parameters can, in principle, be reconstructed with infinite precision.
5 Numerical results
All numerical experiments were performed in MATLAB 2014a on an Intel i7 (2.1 GHz, 12 GByte RAM) running Ubuntu 14.04.
Example 5.1 (d = 1).
We consider the case with parameters on the torus that we identify with the interval . For a sparse exponential sum some of the associated (trigonometric) polynomials are visualized in Figure 5.1, where we start with the upper bound and also indicate the effects of a preconditioner according to Theorem 3.7 on the roots of the polynomials.
The method introduced in [7] finds a polynomial of the form (3.1) as the solution to a convex optimization problem, whereas we find such a polynomial with Prony’s method. For this comparison, we used the MATLAB code provided in [7] and modified it so that it runs for different problem sizes depending on the sparsity . This means that we used roughly samples and random parameters , , satisfying the separation condition in [7]. We only measured the time for finding a polynomial of the form (3.1), since the calculation of the roots is essentially the same in both algorithms. In Figure 5.2 (a), where the times needed by CVX are depicted as circles and the times needed by Prony’s method are depicted as crosses, we see that the solution via convex optimization takes considerably more time. Note that the stopping criterion of the convex optimization program is set to roughly , so the solution accuracy does not increase beyond this point, whereas for Prony’s method the solutions in this test are all on the order of machine accuracy, .
Example 5.2 (d = 2).
We demonstrate how our method reconstructs the parameters from the moments , . For moments of order and the associated space of polynomials with reverse lexicographical order on the terms, we get the block Toeplitz matrix with Toeplitz blocks as follows:
A vector space basis of is given by the polynomials
Since , , , , and , we have and hence . The zero loci of are depicted in Figure 5.3 (a) (in the style of Figure 3.1) resp. (b), where the torus is identified with . Note that we would typically expect the intersection of the zero locus of each polynomial with the torus to be finite, which is the case neither for nor . In Figure 5.3 (c) the sum of the squared absolute values of an orthonormal basis of is drawn.
Example 5.3 (d = 3).
Figure 5.4 depicts the intersection of (identified with ) and the zero loci of two polynomials that arise in the Prony method for parameters when choosing (which is not an upper bound for ). This illustrates that, in the case , the zero locus of a single polynomial intersected with the torus can typically be visualized as a “one-dimensional” curve, as suggested by the heuristic argument that a complex polynomial equation can be thought of as two real equations, which together with the three real equations that define as a subset of provides five equations, thus leaving one real degree of freedom.
6 Summary
We suggested a multivariate generalization of Prony’s method and gave sharp conditions under which the problem admits a unique solution. Moreover, we provided a tight estimate on the condition number for computing the kernel of the involved Toeplitz matrix of moments. Numerical examples were presented for spatial dimensions and showed in particular that a so-called dual certificate in the semidefinite formulation of the moment problem can be computed much faster by solving an eigenvalue problem.
Beyond the scope of this paper, future research needs to address the actual computation of the common roots of the kernel polynomials, the stable reconstruction from noisy moments, and reductions both in the number of used moments as well as in computation time.
Acknowledgment. The authors thank S. Heider for the implementation of the approach [7] for the bivariate case and H. M. Möller for several enlightening discussions. The fourth author expresses his thanks to J. Abbott for warm hospitality during his visit to Genoa and numerous useful suggestions. Moreover, we gratefully acknowledge support by the DFG within the research training group 1916: Combinatorial structures in geometry, and by the Helmholtz Association within the young investigator group VH-NG-526: Fast algorithms for biomedical imaging.
References
 [1] F. Andersson, M. Carlsson, and M. V. de Hoop. Nonlinear approximation of functions in two dimensions by sums of exponential functions. Appl. Comput. Harmon. Anal., 29:156–181, 2010.
 [2] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler. Bertini: Software for numerical algebraic geometry. Available at bertini.nd.edu with permanent doi: dx.doi.org/10.7274/R0H41PB5.
 [3] M. Ben-Or and P. Tiwari. A deterministic algorithm for sparse multivariate polynomial interpolation. In Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing, pages 301–309, 1988.
 [4] T. Bendory, S. Dekel, and A. Feuer. Super-resolution on the sphere using convex optimization. IEEE Trans. Signal Process., 64:2253–2262, 2015.
 [5] T. Bendory, S. Dekel, and A. Feuer. Exact recovery of Dirac ensembles from the projection onto spaces of spherical harmonics. Constructive Approximation, to appear.
 [6] E. J. Candès and C. Fernandez-Granda. Super-resolution from noisy data. J. Fourier Anal. Appl., 19(6):1229–1254, 2013.
 [7] E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Comm. Pure Appl. Math., 67(6):906–956, 2014.
 [8] E. J. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory, 52:489–509, 2006.
 [9] Y. Chi, L. L. Scharf, A. Pezeshki, and A. R. Calderbank. Sensitivity to basis mismatch in compressed sensing. IEEE Trans. Signal Process., 59:2182–2195, 2011.
 [10] B. G. R. de Prony. Essai expérimental et analytique: sur les lois de la dilatabilité de fluides élastique et sur celles de la force expansive de la vapeur de l’alkool, à différentes températures. Journal de l’école polytechnique, 1(22):24–76, 1795.
 [11] B. Dumitrescu. Positive trigonometric polynomials and signal processing applications. Signals and Communication Technology. Springer, Dordrecht, 2007.
 [12] F. Filbir, H. N. Mhaskar, and J. Prestin. On the problem of parameter estimation in exponential sums. Constr. Approx., 35:323–343, 2012.
 [13] M. Giesbrecht, G. Labahn, and W.-s. Lee. On the equivalence between Prony’s and Ben-Or’s/Tiwari’s methods. University of Waterloo Tech Report, 23, 2002.
 [14] A. Gilbert, P. Indyk, M. Iwen, and L. Schmidt. Recent developments in the sparse Fourier transform: A compressed fourier transform for big data. IEEE Signal Proc. Mag., 31(5):91–100, Sept 2014.
 [15] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, USA, 2nd edition, 2013.
 [16] Y. Hua and T. K. Sarkar. Matrix pencil method for estimating parameters of exponentially damped/undamped sinusoids in noise. IEEE Trans. Acoust. Speech Signal Process., 38(5):814–824, 1990.
 [17] A. Iske and B. Diederichs. Parameter estimation for bivariate exponential sums. In SampTA 2015, to appear.
 [18] T. Jiang, N. D. Sidiropoulos, and J. M. F. ten Berge. Almostsure identifiability of multidimensional harmonic retrieval. IEEE Trans. Signal Process., 49(9):1849–1859, 2001.
 [19] V. Komornik and P. Loreti. Fourier Series in Control Theory. SpringerVerlag, New York, 2004.
 [20] S. Kunis and D. Potts. Stability results for scattered data interpolation by trigonometric polynomials. SIAM J. Sci. Comput., 29:1403–1419, 2007.
 [21] Y. Mansour. Randomized interpolation and approximation of sparse polynomials. SIAM J. Comput., 24:357–368, 1995.
 [22] H. M. Möller and T. Sauer. H–bases for polynomial interpolation and system solving. Adv. in Comp. Math., 12:335–362, 2000.
 [23] T. Peter, G. Plonka, and R. Schaback. Reconstruction of multivariate signals via Prony’s method. Proc. Appl. Math. Mech., to appear.
 [24] G. Plonka and M. Tasche. Prony methods for recovery of structured functions. GAMMMitt., 37(2):239–258, 2014.
 [25] G. Plonka and M. Wischerhoff. How many Fourier samples are needed for real function reconstruction? J. Appl. Math. Comput., 42(1–2):117–137, 2013.
 [26] D. Potts and M. Tasche. Parameter estimation for exponential sums by approximate Prony method. Signal Processing, 90(5):1631–1642, 2010.
 [27] D. Potts and M. Tasche. Parameter estimation for multivariate exponential sums. Electron. Trans. Numer. Anal., 40:204–224, 2013.
 [28] H. Rauhut. Random sampling of sparse trigonometric polynomials. Appl. Comput. Harmon. Anal., 22:16–42, 2007.
 [29] M. Reitzner, M. Schulte, and C. Thäle. Limit theory for the Gilbert graph. Preprint, 2015.
 [30] R. Roy and T. Kailath. ESPRIT – estimation of signal parameters via rotational invariance techniques. Signal Processing, Part II, 23:369–411, 1990.
 [31] M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math., 61:1025–1045, 2008.
 [32] R. O. Schmidt. Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas and Propagation, 34(3):276–280, 1986.
 [33] L. Sorber, M. Van Barel, and L. De Lathauwer. Numerical solution of bivariate and polyanalytic polynomial systems. SIAM J. Numer. Anal., 52(4):1551–1572, 2014.
 [34] H. J. Stetter. Numerical Polynomial Algebra. Society for Industrial and Applied Mathematics, Philadelphia, 2004.
 [35] M. Vetterli, P. Marziliano, and T. Blu. Sampling signals with finite rate of innovation. IEEE Trans. Signal Process., 50(6):1417–1428, 2002.