A Generalized Sampling Theorem for Stable Reconstructions in Arbitrary Bases
Abstract
We introduce a generalized framework for sampling and reconstruction in separable Hilbert spaces. Specifically, we establish that it is always possible to stably reconstruct a vector in an arbitrary Riesz basis from sufficiently many of its samples in any other Riesz basis. This framework can be viewed as an extension of that of Eldar et al. However, whilst the latter imposes stringent assumptions on the reconstruction basis, and may in practice be unstable, our framework allows for recovery in any (Riesz) basis in a manner that is completely stable.
Whilst the classical Shannon Sampling Theorem is a special case of our theorem, this framework allows us to exploit additional information about the approximated vector (or, in this case, function), for example sparsity or regularity, to design a reconstruction basis that is better suited. Examples are presented illustrating this procedure.
Keywords: Sampling Theory; Stable Reconstruction; Shannon Sampling Theorem; Infinite Matrices; Hilbert Space; Wavelets
1 Introduction
The Shannon Sampling Theorem, or the Nyquist–Shannon Sampling Theorem as it is also called (we will refer to it as the NSSampling Theorem throughout the paper), is a mainstay in modern signal processing and has become one of the most important theorems in mathematics of information [26]. The list of applications of the theorem is long, and ranges from Magnetic Resonance Imaging (MRI) to sound engineering. We will in this paper address the question on whether or not the NSSampling Theorem can be improved. In particular, given the same set of information, could one design a reconstruction of a function that would be better than that provided by the NSSampling Theorem. The answer to such a question will obviously depend on the type of functions considered. However, suppose that we have some extra information about the functions to be reconstructed. One may, for example, have information about a basis that is particularly suited for such functions. Could this information be used to improve the reconstruction given by the NSSampling Theorem, even if it is based on the same sampling procedure? Although such a question has been posed before, and numerous extensions of the NSSampling Theorem have been developed [1, 2, 9, 10, 27], the generalization we introduce in this paper is, to the best of our knowledge, a novel approach for this problem.
The well known NSSampling Theorem [18, 20, 23, 24, 28] states that if
and for some , then both and can be reconstructed from point samples of . In particular, if then
The quantity which is the largest value of such that the theorem holds, is often referred to as the Nyquist rate [23]. In practice, when trying to reconstruct or , one will most likely not be able to access the infinite amount of information required, namely, Moreover, even if we had access to all samples, we are limited by both processing power and storage to taking only a finite number. Thus, a more realistic scenario is that one will be given a finite number of samples , for some , and seek to reconstruct from these samples. The question is therefore: are the approximations
optimal for and given the information ? To formalize this question consider the following. For and , let
(1.1) 
( denotes the set of continuous functions on ). Define the mappings (with a slight abuse of notation)
(1.2) 
The question is, given a class of functions , could there exist mappings and such that
As we will see later, the answer to this question may very well be yes, and the problem is therefore to find such mappings and
As motivation for our work, consider the following reconstruction problem. Let be defined by
This is the wellknown Haar wavelet. Due to the discontinuity, there is no way one can exactly reconstruct this function with only finitely many function samples if one insists on using the mapping . We have visualized the reconstruction of using in Figure 1. In addition to not being reconstructed exactly, the approximation is polluted by oscillations near the discontinuities of . Such oscillations are indicative of the wellknown Gibbs phenomenon in recovering discontinuous signals from samples of their Fourier transforms [17]. This phenomenon is a major hurdle in many applications, including image and signal processing. Its resolution has, and continues to be, the subject of significant inquiry [25].
However, it is tempting to think that one could construct a mapping that would yield a better result. Suppose for a moment that we do not know , but we do have some extra information. In particular, suppose that we know that , where
(1.3) 
for some finite number and where are the Haar wavelets on the interval Could we, based on the extra knowledge of , construct mappings and such that
Indeed, this is the case, and a consequence of our framework is that it is possible to find and such that
provided is sufficiently large. In other words, one gets perfect reconstruction. Moreover, the reconstruction is done in a completely stable way.
The main tool for this task is a generalization of the NSSampling Theorem that allows reconstructions in arbitrary bases. Having said this, whilst the Shannon Sampling Theorem is the most standard example, the framework we develop addresses the more abstract problem of recovering a vector (belonging to some separable Hilbert space ) given a finite number of its samples with respect any Riesz basis of .
1.1 Organization of the Paper
We have organized the paper as follows. In Section 2 we introduce notation and idea of finite sections of infinite matrices, a concept that will be crucial throughout the paper, and in Section 3 we discuss existing literature on this topic, including the work of Eldar et al [7, 8, 27]. The main theorem is presented and proved in Section 4, where we also show the connection to the classical NSSampling Theorem. The error bounds in the generalized sampling theorem involve several important constants, which can be estimated numerically. We therefore devote Section 5 to discussions on how to compute crucial constants and functions that are useful for providing error estimates. Finally, in Section 6 we provide several examples to support the generalized sampling theorem and to justify our approach.
2 Background and Notation
Let denote the imaginary unit. Define the Fourier transform by
where, for vectors Aside from the Hilbert space , we now introduce two other important Hilbert spaces: namely,
and
with their obvious inner products. We will also consider abstract Hilbert spaces. In this case we will use the notation . Note that and will always denote the natural bases for and respectively. We may also use the notation for both and (the meaning will be clear from the context). Throughout the paper, the symbol will denote the standard tensor product on Hilbert spaces.
The concept of infinite matrices will be quite crucial in the theory, and also finite sections of such matrices. We will consider infinite matrices as operators from both to and to . The set of bounded operators from a Hilbert space to a Hilbert space will be denoted by . As infinite matrices are unsuitable for computations we must reduce any infinite matrix to a more tractable finitedimensional object. The standard means in which to do this is via finite sections. In particular, let
For , define to be the projection onto and, for odd , let be the projection onto . Then may be interpreted as
a finite section of . Finally, the spectrum of any operator will be denoted by
3 Connection to Earlier Work
The idea of reconstructing signals in arbitrary bases is certainly not a new idea and this topic has gone through extensive investigations in the last several decades. The papers by Unser and Aldroubi [1, 27] have been very influential and these ideas have been generalized to arbitrary Hilbert spaces by Eldar [7, 8]. The abstract framework introduced by Eldar is very powerful because of its general nature. Our framework is based on similar generalizations, yet it incorporates several key distinctions, resulting in a number of advantages. Before introducing this framework, let us first review some of the key concepts of [8].
Let be a separable Hilbert space and let be an element we would like to reconstruct from some measurements. Suppose that we are given linearly independent sampling vectors that span a subspace and form a Riesz basis, and assume that we can access the sampled inner products , . Suppose also that we are given linearly independent reconstruction vectors that span a subspace and also form a Riesz basis. The task is to obtain a reconstruction based on the sampling data . The natural choice, as suggested in [8], is
(3.1) 
where the socalled synthesis operators are defined by
and their adjoints are easily seen to be
Note that will be invertible if and only if
This gives a very nice and intuitive abstract formulation of the reconstruction. However, in practice we will never have the luxury of being able to acquire nor process the infinite amount of samples , , needed to construct . An important question to ask is therefore:

What if we are given only the first samples , ? In this case we cannot use (3.1), and we may ask: what to do?
Fortunately, there is a simple finite dimensional analog to the infinite dimensional ideas discussed above. Suppose that we are given linearly independent sampling vectors that span a subspace , and assume that we can access the sampled inner products , . Suppose also that we are given linearly independent reconstruction vectors that span a subspace . The task is to construct an approximation to based on the samples In particular, we are interested in finding coefficients (that are computed from the samples ) such that . The reconstruction suggested in [6] is
(3.2) 
where the operators are defined by
(3.3) 
and their adjoints are easily seen to be
From this it is clear that we can express as the matrix
(3.4) 
Also, is invertible if and only if and ([6, Prop. 3])
(3.5) 
Thus, to construct one simply solves a linear system of equations. The error can now conveniently be bounded from above and below by
where is the projection onto ,
is the cosine of the angles between the subspaces and and is the projection onto [6]. Note that if , then exactly, a feature known as perfect recovery. Another facet of this framework is socalled consistency: the samples , , of the approximation are identical to those of the original function (indeed, , as given by (3.2), can be equivalently defined as the unique element in that is consistent with ).
Returning to this issue at hand, there are now several important questions to ask:

What if so that is not invertible? It is very easy to construct theoretical examples such that is not invertible, however (as we will see below), such situations may very well occur in applications. In fact, is a rather strict condition. If we have that does that mean that is is impossible to construct an approximation from the samples ?

What if is large? The stability of the method must clearly depend on the quantity . Thus, even if exists, one may not be able to use the method in practice as there will likely be increased sensitivity to both roundoff error and noise.
Our framework is specifically designed to tackle these issues. But before we present our idea, let us consider some examples where the issues in (i) and (ii) will be present.
Example 3.1.
As for (i), the simplest example is to let and be the natural basis ( is the infinite sequence with in its th coordinate and zeros elsewhere). For , let the sampling vectors and the reconstruction vectors be defined by and Then, clearly, .
Example 3.2.
For an example of more practical interest, consider the following: For let , and, for odd , define the sampling vectors
(this is exactly the type of measurement vector that will be used if one models Magnetic Resonance Imaging) and let the reconstruction vectors denote the first Haar wavelets on (including the constant function, ). Let and be as in (3.3), according to the sampling and reconstruction vectors just defined. A plot of as a function of and is given in Figure 2. As we observe, for only certain values of yield stable reconstruction, whereas for the other values of the quantity grows exponentially with , making the problem severely illconditioned. Further computations suggest that increases exponentially with not just for these values of , but for all
Example 3.3.
Another example can be made by replacing the Haar wavelet basis with the basis consisting of Legendre polynomials (orthogonal polynomials on with respect to the Euclidean inner product).
In Figure 3 we plot the quantity . Unlike the previous example, this quantity grows exponentially and monotonically in . Whilst this not only makes the method highly susceptible to roundoff error and noise, it can also prevent convergence of the approximation (as ). In essence, for convergence to occur, the error must decay more rapidly than the quantity grows. Whenever this is not the case, convergence is not assured. To illustrate this shortcoming, in Figure 3 we also plot the error , where . The complex singularity at limits the convergence rate of sufficiently so that does not converge to . Note that this effect is well documented as occurring in a related reconstruction problem, where a function defined on is interpolated at equidistant pointwise samples by a polynomial of degree . This is the famous Runge phenomenon. The problem considered above (reconstruction from Fourier samples) can be viewed as a continuous analogue of this phenomenon.
Actually, the phenomenon illustrated in Examples 3.2 and 3.3 is not hard to explain if one looks at the problem from an operatortheoretical point of view. This is the topic of the next section.
3.1 Connections to the Finite Section Method
To illustrate the idea, let and be two sequences of linearly independent elements in a Hilbert space . Define the infinite matrix by
(3.6) 
Thus, by (3.4) the operator is simply the by finite section of . In particular
The finite section method has been studied extensively for the last decades [3, 12, 13, 21]. It is well known that even if is invertible then may never be invertible for any . In fact one must have rather strict conditions on for to be invertible with uniformly bounded inverse (such as positive selfadjointness, for example [21]). In addition, even if is invertible and is invertible for all it may be the case that, if
then
Suppose that and are two Riesz bases for closed subspaces and of a separable Hilbert space . Define the operators by
(3.7) 
Suppose now that exists. For , let the spaces and operators be defined as in Section 3 according to the vectors and respectively. The following scenarios may now arise:

, yet

and the inverse exists for all , but

exists for all however
for some .
Thus, in order for us to have a completely general sampling theorem we must try to extend the framework described in this section in order to overcome the obstacles listed above.
4 The New Approach
4.1 The Idea
One would like to have a completely general sampling theory that can be described as follows:

We have a signal and a Riez basis that spans some closed subspace , and
So (we may also typically have some information on the decay rate of the s, however, this is not crucial for our theory).

We have sampling vectors that form a Riez basis for a closed subspace (note that we may not have the luxury of choosing such sampling vectors as they may be specified by some particular model, as is the case in MRI) and we can access the sampling values
Goal: reconstruct the best possible approximation based on the finite subset of the sampling information .
We could have chosen vectors and defined the operators and as in (3.3) (from and ) and let be defined by (3.2). However, this may be impossible as may not be invertible (or the inverse may have a very large norm), as discussed in Examples 3.2 and 3.3. The question is then: what to do?
To deal with these issues we will launch an abstract sampling theorem that extends the ideas discussed above. To do so, we first notice that, since and are Riesz bases, there exist constants such that
(4.1) 
Now let be defined as in (3.6). Instead of dealing with we propose to choose and compute the solution of the following equation:
(4.2) 
provided a solution exists (later we will provide estimates on the size of for (4.2) to have a unique solution). Finally we let
(4.3) 
Note that, for this is equivalent to (3.2), and thus we have simply extended the framework discussed in Section 3. However, for this is no longer the case. As we later establish, allowing to range independently of is the key to the advantage possessed by this framework.
Before doing so, however, we first mention that the framework proposed above differs from that discussed previously in that it is inconsistent. Unlike (3.2), the samples do not coincide with those of the function . Yet, as we shall now see, by dropping the requirement of consistency, we obtain a reconstruction which circumvents the aforementioned issues associated with (3.2).
4.2 The Abstract Sampling Theorem
The task is now to analyze the model in (4.2) by both establishing existence of and providing error bounds for . We have
Theorem 4.1.
Let be a separable Hilbert space and be closed subspaces such that . Suppose that and are Riesz bases for and respectively with constants . Suppose that
(4.4) 
Let . Then there is an (in particular ) such that, for all , the solution to (4.2) is unique. Also, if is as in (4.3), then
(4.5) 
where
(4.6) 
The theorem has an immediate corollary that is useful for estimating the error. We have
Corollary 4.2.
With the same assumptions as in Theorem 4.1 and fixed ,
(4.7) 
In addition, if is an isometry (in particular, when are orthonormal) then it follows that
Proof of Theorem 4.1.
Let be as in as in (3.6). Then (4.4) yields the following infinite system of equations:
(4.8) 
Note that must be a bounded operator. Indeed, let and be as in (3.7). Since
it follows that . However, from (4.1) we find that both and are bounded as mappings from onto and respectively, with , , thus yielding our claim. Note also that, by the assumption that , (4.8) has a unique solution. Indeed, since it follows that , so must be injective.
Now let , then (4.8) gives us that
(4.9) 
Suppose for a moment that we can show that there exists an such that is invertible for all . Hence, we may appeal to (4.9), whence
(4.10) 
and therefore, by (4.9) and (4.1),
where
Thus, (4.5) is established, provided we can show the following claim:
Claim: There exists an such that is invertible for all . Moreover,
To prove the claim, we first need to show that is invertible for all . To see this, let denote the numerical range. Note that is selfadjoint and invertible. The latter implies that there is a neighborhood around zero such that and the former implies that the numerical range Now the spectrum Thus,
thus, is always invertible. Now, make the following two observations
(4.11) 
where the last series converges at least strongly (it converges in norm, but that is a part of the proof). The first is obvious. The second observation follows from the fact that strongly as Note that