On the d-dimensional Quasi-EquallySpaced Sampling

On the -dimensional Quasi-Equally
Spaced Sampling

Alessandro Nordio, Carla-Fabiana Chiasserini, Emanuele Viterbo
Dipartimento di Elettronica, Politecnico di Torino
C. Duca degli Abruzzi 24, I-10129 Torino, Italy
Phone: +39 011 090 4226; Fax +39 011 0904099
E-mail: {alessandro.nordio,chiasserini}@polito.it
DEIS, Università della Calabria
Via P. Bucci, Cubo 42C, 87036 Rende (CS), Italy
Phone: +39 0984 494778; Fax +39 0984 494713
E-mail: viterbo@deis.unical.it
Abstract

We study a class of random matrices that appear in several communication and signal processing applications, and whose asymptotic eigenvalue distribution is closely related to the reconstruction error of an irregularly sampled bandlimited signal. We focus on the case where the random variables characterizing these matrices are -dimensional vectors, independent, and quasi-equally spaced, i.e., they have an arbitrary distribution and their averages are vertices of a -dimensional grid. Although a closed form expression of the eigenvalue distribution is still unknown, under these conditions we are able (i) to derive the distribution moments as the matrix size grows to infinity, while its aspect ratio is kept constant, and (ii) to show that the eigenvalue distribution tends to the Marčenko-Pastur law as . These results can find application in several fields, as an example we show how they can be used for the estimation of the mean square error provided by linear reconstruction techniques.

EDICS: DSP-RECO Signal reconstruction, DSP-SAMP Sampling, SPC-PERF Performance analysis and bounds.

I Introduction

Consider the class of random matrices of size , with entries given by

(1)

The generic element of can be written as: , , , where are independent random variables characterized by a probability density function (pdf) , with . These matrices are Vandermonde matrices with complex exponential entries; they appear in many signal/image processing applications and have been studied in a number of recent works, (see e.g., [1, 2, 3, 4, 5, 6, 7, 8]). More specifically, in the field of signal processing for sensor networks, [1, 2] studied the performance of linear reconstruction techniques for physical fields irregularly sampled by sensors. In such scenario, the random variables in (1) represent the coordinates of the sensor nodes. The work in [3] addressed the case where these coordinates are uniformly distributed and subject to an unknown jitter. In the field of communications, the study in [8] presented a number of applications where these matrices appear, which range from multiuser MIMO systems to multifold scattering.

In spite of their numerous applications, few results are known for the Vandermonde matrices in (1). In particular, a closed form expression for the eigenvalue distribution of the Hermitian Toeplitz matrix , as well as its asymptotic behavior, would be of great interest. As an example, in [1, 2, 6], it has been observed that the performance of linear techniques for reconstructing a signal from a set of irregularly-spaced samples with known coordinates is a function of the asymptotic eigenvalue distribution of . The asymptotic eigenvalue distribution of is defined as the distribution of its eigenvalues, in the limit of and growing to infinity while their ratio is kept constant. Unfortunately, such distribution is still unknown.

In this work, we consider a general formulation which extends the model in to the -dimensional domain. We study the properties of random matrices of size and entries given by

(2)

where the vectors have independent entries, characterized by the pdf , , , and is the number of dimensions. The invertible function

(3)

maps the vector of integers , onto a scalar index, i.e., the row index of the matrix . Notice that, when , reduces to (1).

For the matrix model in (2), we study the interesting case where are independent, quasi-equally spaced random variables in the -dimensional hypercube . In other words, we assume that the averages of are the vertices of a -dimensional grid in . This is often the case arising in measurement systems affected by jitter, or in sensor network deployments where the sensors sampling the physical field can only be roughly placed at equally spaced positions, due to terrain conditions and deployment practicality [9]. Note that the distribution of the random variables can be of any kind, the only assumption we make is on their averages being equally spaced. Since an analytic expression of the eigenvalue distribution of is unknown, we derive a closed form expression for its moments. This enables us to show that, as , the eigenvalue distribution tends to the Marčenko-Pastur law [12]. At the end of the paper, we present some numerical results and applications where the moments and the asymptotic approximation to the eigenvalue distribution of can be of great use.

Ii Previous Results and Problem Formulation

As a first step, we briefly review previous results on the matrices. In a one-dimensional domain (), the work in [1] considered an irregularly sampled bandlimited signal, which is reconstructed using linear techniques and assuming the samples coordinates to be known. The performance of the reconstruction system was derived as a function of the eigenvalue distribution of the matrix , where is the aspect ratio111The aspect ratio of is the ratio between the number of rows and the number of columns of the matrix of  [1, 2]. An explicit expression of the moments

was attained in [4, 5], for the specific case where are uniformly distributed in . Also, in the case where are independent, quasi-equally spaced random variables, the analytic expression of the second moment of the eigenvalue distribution of , i.e., , was obtained in [3]. Then, in [7] the moments were derived for an arbitrary distribution .

In [4, 5], the -dimensional model (2) was also investigated. There, the properties of the random matrices were studied in the case where the vectors have independent entries, uniformly distributed in the hypercube . Under such assumptions, and for given and aspect ratio , an analytic expression of the moments of was derived and it was shown that, as , tends to the Marčenko-Pastur law [12], i.e.,

where , , .

The following sections detail the problem addressed in this work and introduce some useful notations.

Ii-a The quasi-equally spaced multidimensional model

We consider the matrix class in (2) and assume that the vectors are independent, quasi-equally spaced random variables in the -dimensional hypercube , i.e., the averages of are the vertices of a -dimensional grid in .

We define as the number of vertices per dimension, thus the total number of vertices is . We denote the coordinate of a generic vertex of the grid by the vector , where , is an integer vector and . For notation simplicity and in analogy with (3), we identify the vertex with coordinate by the scalar index

(4)

Notice that is an invertible function and allows us to write

where the average

is the coordinate of the sample identified by the scalar label and 1 is the all ones vector. Furthermore, we assume that the entries of the vectors are i.i.d. with pdf which does not depend on , , or . By using this notation, the entries of are then given by

(5)

while the aspect ratio is

(6)

The Hermitian Toeplitz matrix is defined as

(7)

where represents a -dimensional sum over all vectors such that , .

Our goals are (i) to derive the analytic expression of the moments of with quasi-equally spaced vectors (Section III), and (ii) to show that as , tends to the Marčenko-Pastur law (Section IV).

Iii Closed form expression of the moments of the asymptotic eigenvalue pdf

Following the approach adopted in [13, 14], in the limit for and growing to infinity with constant aspect ratio and dimension , we compute the closed form expression of , which can be obtained from the powers of as [15],

(8)

In (8) the symbol identifies the matrix trace operator, and the average is computed over the set of random variables . An efficient method to compute (8) exploits set partitioning. Indeed, note that the power is the matrix product of copies of . This operation yields exponential terms, whose exponents are given by a sum of terms of the form (see also (22) in Appendix A). The average of this sum depends on the number of distinct vectors , and all possible cases can be described as partitions of the set . In particular, the case where in the set there are distinct vectors, corresponds to a partition of in subsets. It follows that a fundamental step to calculate (8) is the computation of all possible partitions of set . Before proceeding further in our analysis, we therefore introduce some useful definitions related to set partitioning.

Iii-a Definitions

Let the integer denote the moment order and let the vector be a possible combination of integers. In our specific case, each entry of the vector is given by the expression in (4), i.e., and, thus, can range between and .

We define:

  • the scalar integer as the number of distinct entries of the vector ;

  • as the vector of integers, of length , whose entries , , are the entries of without repetitions, in order of appearance within ;

  • as the set of indices of the entries of with value , ;

  • the vector such that, for any given , we have if , .

      Example 1: Let , then since the entries of take 5 distinct values (i.e., ). Such values, taken in order of appearance in form the vector . The value appears at position in , therefore . The value appears at positions 2 and 5 in , therefore . Similarly , , and . By using the sets we build the vector, . For each we assign the value to every such that . For example, since the integers 2 and 5 are in . In conclusion .  

    Furthermore, we define:

  • as the set of partitions of ;

  • as the set of partitions of in subsets, , with .

Note that: (i) the cardinality of , denoted by , is the -th Bell number [16] and (ii) the cardinality of , denoted by , is a Stirling number of the second kind [17].

From the above definitions, it follows that:

  1. the vector induces a partition of the set which is identified by the subsets . These subsets have the following properties

    Even though the partition identified by is often represented as , by its definition, an equivalent representation of such partition is given by the vector . Therefore, from now on we will refer to as a partition of the element set induced by (for simplicity, however, often we will not explicit the dependency of on );

  2. , since the entries of take all possible values in the set ;

  3. , for .

At last, we define as the set of inducing the same partition of .

  Example 2: Let and . Since and , , we have possible vectors , namely, . Each identifies a partition , with , as described in Example 1. The sets of partitions , are given by , , and , and have cardinality , and , respectively. The set of vectors identifying the partition , i.e., , is given by: . Similarly,  

Iii-B Closed form expression of

By using the definitions in Section III-A and by applying set partitioning to (8), we can state the first main result of this work:

Theorem III.1

Let be a Hermitian random matrix as defined in (7), where the properties of the random vectors are described in Section II-A. Then, for any given and , the -th moment of the asymptotic eigenvalue distribution of is given by:

(9)

where

(10)
(11)

and for . In (11), we defined as the -dimensional hypercube , as the characteristic function of , as the Dirac’s delta, and

, , and .

Proof:

The proof can be found in Appendix A. \qed

With the aim to give an intuitive explanation of the above expressions, note that the right hand side of (9) counts all possible partitions of the set , in (11) accounts for the generic distribution of the variables , and the quantity represents the indices pairing that appears in the exponent of the generic entry of the power .

To further clarify the moments computation, Table I reports an example of partition sets for and , while Example 3 shows the computation of the second moment of the eigenvalue distribution.

[1], 1
[1,1], -1 [1,2], 1
[1,1,1], 2
[1,1,2], -1
[1,2,1], -1
[1,2,2], -1
[1,2,3], 1
TABLE I: Partition sets for , and . Each partition is represented through its associated vector and the value of
  Example 3: We compute the analytic expression of . Using (9), we get: By expanding this expression and using Table I, we obtain We notice that, for , . The term refers instead to the case , and it is given by with and . It follows that Finally, Thus, we write  

Iv Convergence to the Marčenko-Pastur distribution

In this section we show that the asymptotic eigenvalue distribution of the matrix tends to the Marčenko-Pastur law [12], as . This is equivalent to prove that, as , the -th moment of tends to the -th moment of the Marčenko-Pastur distribution with parameter , for every .

Theorem IV.1

Let be a Hermitian random matrix as defined in (7), where the properties of the random vectors are described in Section II-A. Let be the -th moment of the asymptotic eigenvalue distribution of , given by Theorem III.1. Then, for any given ,

(12)

where are the Narayana numbers [18, 19] and are the Narayana polynomials, i.e., the moments of the Marčenko-Pastur distribution [12].

Proof:

We first look at the expression of the -th asymptotic moment and observe that, for , the contribution of the term in the right hand side of (9) reduces to

(13)

The cardinality of is and . Thus, we only consider . Moreover, using (10) we have since each subset has cardinality , . Therefore, the term in (13) becomes

Using (11) with , we have:

(14)

Hence, the contribution to the -th moment reduces to

(15)

In [4, 5] it is shown that, as ,  (15) tends to the Narayana polynomial of order . It follows that, in order to prove the theorem, it is enough to show that for the contribution of the term in the right hand side of (9), to the expression of the -th asymptotic moment, vanishes as . In practice we have to show that, for each and , with ,

or, equivalently, that .

We first notice that for

(16)

Moreover, we have:

(17)

The equality arises if the condition is always verified, otherwise, if , .

Next, we make the following observations: (i) since we consider partitions of the form in subsets with , then at least one of the sets has cardinality ; (ii) the term

gives a non-zero contribution to the integral in (16) only when . Hence, if for some , then some will provide a non-zero contribution to the integral in (16). In this case, we can write

(18)

which proves the claim.

When , again, there is a measurable subset of for which , hence,

i.e., the strict inequality holds.

\qed

Fig. 1: Comparison between the Marčenko-Pastur distribution and the empirical distribution obtained for and in the quasi equally space case, and uniform

In Figure 1, we show the empirical eigenvalue distribution of the matrix for , , and uniformly distributed in . The empirical distribution is compared to the Marčenko-Pastur distribution (solid line). We observe that as, increases, the Marčenko-Pastur distribution law becomes a good approximation of . In particular, the two curves are relatively close for small , already for .

V Applications

Here we present some applications where the results derived in this work can be used.

The closed form expression of the moments of , given by (34), can be a useful basis for performing deconvolution operations, as proposed in [8]. As for the asymptotic approximation, we show below how to exploit our results for the estimation of the MSE provided by linear reconstruction techniques of irregularly sampled signals.

Let us assume a general linear system model affected by additive noise. For simplicity, consider a one-dimensional signal, . When observed over a finite interval, it admits an infinite Fourier series expansion [1, 2]. We can think of the largest index of the non-negligible Fourier coefficients of the expansion as the approximate one-sided bandwidth of the signal. We therefore represent by using complex harmonics as

(19)

Now, consider that the signal is observed within one period interval and sampled in points placed at positions , , . The complex numbers represent amplitudes and phases of the harmonics in . The signal samples can be written as , where the matrix is given in (1). The signal discrete spectrum is given by the complex vector . We can now write the linear model for a measurement sample vector taken at the sampling points

(20)

where is a random vector representing measurement noise. The general problem is to reconstruct or given the noisy measurements [4, 5]. A commonly used parameter to measure the quality of the estimate of the reconstructed signal is the mean square error (MSE). In [1, 2, 3] it has been shown that, when linear reconstruction techniques are used and the sample coordinates are known, the asymptotic MSE (i.e., as the number of harmonics and the number of samples tend to infinity while their ratio is kept constant) is a function of the asymptotic eigenvalue distribution of the matrix , i.e.,

(21)

where the random variable has distribution and SNR is the signal-to-noise ratio on the measure. We therefore exploit our asymptotic approximation to to compute (21).

Fig. 2: MSE as a function of the signal-to-noise ratio for . The curves are compared with the results obtained through our asymptotic analysis (MP) and with the equally spaced case

Fig. 3: MSE as a function of the signal-to-noise ratio for . The curves are obtained for and compared against both the equally spaced case and the results derived through our asymptotic analysis (MP)

Figure 2 shows the MSE obtained as a function of the signal-to-noise ratio SNR. The curves with markers labeled by “” refer to the cases where the signal has dimension and the sampling points are quasi-equally spaced with jitter , uniformly distributed over , and . The curve labeled by “MP” (thick line) reports the results derived through our asymptotic () approximation to the eigenvalue distribution, while the curve labeled by “Equally spaced” (dashed line) represents the MSE achieved under a perfect equally spaced sample placement, i.e., when the eigenvalue distribution is given by . Notice that the MSE grows as increases and tends to the MSE obtained by a Marčenko-Pastur eigenvalue distribution. Instead, as expected, the “Equally spaced” curve represents a lower bound to the system performance.

Figure 3 presents similar results but obtained for and different values of . We observe that the MSE obtained through our asymptotic approximation (the curve labeled by “MP”) gives excellent results for values of as small as 0.2, even when compared against the numerical results derived by fixing . For (i.e., when the ratio of the number of signal harmonics to the number of samples increases), the approximation becomes slightly looser, and the MSE computed by using the Marčenko-Pastur distribution gives an upper limit to the quality of the reconstructed signal. Note that the smaller the , the higher the oversampling rate relative to the equally spaced minimal sampling rate . We thus observe how our bound becomes tighter as the oversampling rate increases.

To conclude, we describe some areas in signal processing where the above system model and results find application.

  • Spectral estimation with noise. Spectral estimation from high precision sampling and quantization of bandlimited signals uses measurement systems which are usually affected by jitter [20]. In such applications the quantization noise corresponds to the measurement noise and the jitter is caused by the limited accuracy of the timing circuits. In this case the sampling points are mismatched with respect to the nominal values, thus for we have: with some sampling rate . Note that the exact positions of the samples are not known and the case studied in this paper (i.e., MSE with exact positions) gives a lower bound to the reconstruction error.

  • Signal reconstruction in sensor networks. Sensor networks, whose nodes sample a physical field, like air temperature, light intensity, pollution levels or rain falls, typically represent an example of quasi-equally spaced sampling [3, 9, 21, 22]. Indeed, often sensors are not regularly deployed in the area of interest due to terrain conditions and deployment practicality and, thus, the physical field is not regularly sampled in the space domain. Sensors report the data to a common processing unit (or sink node), which is in charge of reconstructing the sensed field, based on the received samples and on the knowledge of their coordinates. If the field can be approximated as bandlimited in the space domain, then an estimate of the discrete spectrum can be obtained by using linear reconstruction techniques [23, 3], even in presence of additive noise. In this case, our approximation allows to compute the MSE on the reconstructed field.

  • Stochastic sampling in computer graphics and image processing. Jittered sampling was first examined by Balakrishnan in [24], who analyzed it as an undesirable effect in sampling continuous time functions. More than twenty years later, Cook [25] realized that the effect of stochastic sampling can be advantageous in computer graphics to reduce aliasing artifacts, and considered jittering a regular grid as an effective sampling technique. Another example of sampling with jitter was recently proposed in [26], for robust authentication of images.

Vi Conclusions

We studied the behavior of the eigenvalue distribution of a class of random matrices, which find large application in signal and image processing. In particular, by using asymptotic analysis, we derived a closed-form expression for the moments of the eigenvalue distribution. Using these moments, we showed that, as the signal dimension goes to infinity, the asymptotic eigenvalue distribution tends to the Marčenko-Pastur law. This result allowed us to obtain a simple and accurate bound to the signal reconstruction error, which can find application in several fields, such as jittered sampling, sensor networks, computer graphics and image processing.

Appendix A Proof of Theorem iii.1

Using (7), the term in (8) can be written as:

(22)

where and are sets of integer matrices such that

and

A-a Set partitioning

We now apply the definitions in Section III-A in order to rewrite (22) using set partitioning. In particular by considering the vector where and is the -th column of , we observe that:

  • the vector is uniquely defined by , and a given uniquely defines a matrix since is an invertible function;

  • a given induces a partition ;

  • since is the number of values that the entries can take, there exist matrices generating a given partition of made of subsets. In other words distinct ’s yield the same partition .

Since the random vectors and are independent for , for any given the average operator in (22) factorizes into terms, i.e.,