Analysis of regularized inversion of data
corrupted by white Gaussian noise
Abstract.
Tikhonov regularization is studied in the case of a linear pseudodifferential operator as the forward map and additive white Gaussian noise as the measurement error. The measurement model for an unknown function $u(x)$ is
$$m(x) = Au(x) + \delta\varepsilon(x),$$
where $\delta > 0$ is the noise magnitude. If $\varepsilon$ were an $L^2$ function, Tikhonov regularization would give the estimate
$$T_\alpha(m) = \operatorname*{arg\,min}_{u \in H^r} \big\{ \|Au - m\|_{L^2}^2 + \alpha \|u\|_{H^r}^2 \big\}$$
for $u$, where $\alpha = \alpha(\delta)$ is the regularization parameter. Here penalization of the Sobolev norm $\|u\|_{H^r}$ covers the cases of standard Tikhonov regularization ($r = 0$) and first derivative penalty ($r = 1$).
Realizations of white Gaussian noise are almost never in $L^2$, but do belong to $H^s$ with probability one if $s < 0$ is small enough. A modification of Tikhonov regularization theory is presented, covering the case of white Gaussian measurement noise. Furthermore, the convergence of regularized reconstructions to the correct solution as $\delta \to 0$ is proven in appropriate function spaces using microlocal analysis. The convergence of the related finite-dimensional problems to the infinite-dimensional problem is also analysed.
Key words and phrases: Regularization, inverse problem, white noise, pseudodifferential operator
1. Introduction
1.1. Discrete and continuous regularization
Consider the following continuous model for indirect measurements:
(1.1) $m = Au + \delta\varepsilon$,
where the data $m$ and the quantity of interest $u$ are real-valued functions of $d$ real variables and $A$ is a bounded linear operator. A large class of practical measurements can be modelled by operators $A$ arising from partial differential equations of mathematical physics. We focus on ill-posed inverse problems where $A$ does not have a continuous inverse.
Physical measurement devices produce a discrete data vector $m_k \in \mathbb{R}^k$, which we model by adding a linear operator $P_k$ to (1.1):
(1.2) $m_k = P_k(Au) + \delta P_k\varepsilon$.
Furthermore, practical solution of the inverse problem calls for a discrete representation of the unknown $u$. This can be done using some computationally feasible approximation of $u$ with $n$ degrees of freedom, for example a Fourier series truncated to $n$ terms. The practical inverse problem is now
(1.3) 
We study the most common computational approach to (1.3), namely classical Tikhonov regularization defined by
(1.4) $T_\alpha^{n,k}(m_k) = \operatorname*{arg\,min}_{u \in \mathbb{R}^n} \big\{ \|\mathbf{A}u - m_k\|_2^2 + \alpha \|Lu\|_2^2 \big\}$.
Here $\mathbf{A}$ is a $k \times n$ matrix approximation to the operator $A$, and $\alpha > 0$ is the regularization parameter. The matrix $L$ is used to introduce a priori information to the inversion. For example,
(a) $L = I$, the identity matrix, models the a priori information that $u$ is not very large in norm.
(b) $L$ built from a finite-difference first-order derivative matrix models the a priori information that $u$ is continuously differentiable and that $u$ or its derivative is not very large in square norm.
Our aim is to provide new analytic insight into the relationship between the continuous model (1.1) and practical inversion based on (1.4) in the case of Gaussian noise.
Note that the reconstruction $T_\alpha^{n,k}(m_k)$ given by (1.4) depends on both $n$ and $k$. Practical computational inversion may involve modifying both of them: updating the measurement device changes the number $k$ of data points, and refining the computational grid in the hope of extra accuracy will increase $n$. Furthermore, sometimes the most efficient numerical algorithm is based on a multigrid strategy, involving the computation of $T_\alpha^{n,k}(m_k)$ with several different $n$.
Since there is a common continuous inverse problem behind the discrete model, it is desirable that the reconstruction converges to a meaningful limit as $n, k \to \infty$. Such convergence would also ensure that the dependency of $T_\alpha^{n,k}(m_k)$ on $n$ and $k$ is stable, at least for large enough values. Therefore, we discuss a continuous version of (1.4) based directly on the ideal model (1.1).
Under certain assumptions (including that $\varepsilon$ should be an $L^2$ function) the finite-dimensional problem (1.4) converges as $n, k \to \infty$ to the following infinite-dimensional minimization problem in a Sobolev space $H^r$:
(1.5) $T_\alpha(m) = \operatorname*{arg\,min}_{u \in H^r} \big\{ \|Au - m\|_{L^2}^2 + \alpha \|u\|_{H^r}^2 \big\}$.
See Section 5 below for a proof. In (1.5) the case $r = 0$ corresponds to (a) and $r = 1$ corresponds, roughly, to (b) above. However, formula (1.5) only makes sense if the noise in (1.1) is square integrable. This brings us to the main topic of the paper: noise modeling.
1.2. Properties of white noise
Next we give the definitions of discrete and continuous white noise and describe the "white noise paradox" arising from the infinite $L^2$ norm of the natural limit of white Gaussian noise in $\mathbb{R}^k$ when $k \to \infty$.
We model the $k$-dimensional noise in (1.2) as $\delta e$, where $\delta > 0$ plays the role of noise amplitude. The vector $e$ is a realization of an $\mathbb{R}^k$-valued Gaussian random variable $E$ having mean zero and unit variance: $E \sim \mathcal{N}(0, I)$. In terms of a probability density function we have
(1.6) $\pi_E(e) = c \exp\big(-\tfrac{1}{2}\|e\|_2^2\big)$.
The appearance of $\|\cdot\|_2^2$ in (1.6) is the reason why the square norm is used in the data fidelity term of (1.4). The above noise model is appropriate, for example, for photon counting under high radiation intensity; see e.g. [27, 48].
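To make the discrete noise model concrete, the following minimal Python sketch draws realizations of a standard Gaussian vector in $\mathbb{R}^k$ and evaluates the unnormalized log-density $-\tfrac12\|e\|_2^2$. The function names, the number of trials and the seed are illustrative assumptions and do not come from the original text. The last function estimates $\|e\|_2^2 / k$, which stays close to one, so the squared norm itself grows roughly like $k$.

```python
import random


def sample_white_noise(k, rng):
    """Draw one realization e of a standard Gaussian vector in R^k."""
    return [rng.gauss(0.0, 1.0) for _ in range(k)]


def squared_norm(v):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(x * x for x in v)


def log_density_unnormalized(e):
    """Logarithm of exp(-||e||^2 / 2), the Gaussian density up to a constant."""
    return -0.5 * squared_norm(e)


def mean_squared_norm_per_component(k, trials=20, seed=0):
    """Average of ||e||^2 / k over several realizations (hypothetical helper)."""
    rng = random.Random(seed)
    total = sum(squared_norm(sample_white_noise(k, rng)) for _ in range(trials))
    return total / (trials * k)
```

Calling `mean_squared_norm_per_component(k)` for increasing `k` should return values near one each time, which is the discrete counterpart of the norm growth behind the paradox.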
Let us relate the above to the continuous model (1.1). We take $m$ and $u$ to be functions defined on a closed, compact $d$-dimensional manifold $N$, and the operator $A$ to be a pseudodifferential operator (ΨDO). Furthermore, the noise in (1.1) is modelled as $\delta\varepsilon$, where $\delta > 0$ is the noise amplitude and $\varepsilon$ is a realization of normalised Gaussian white noise $W$.
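Rigorous treatment of white noise on $N$ is based on generalized functions (distributions). We denote the pairing of a distribution $f$ and a test function $\phi$ by $\langle f, \phi \rangle$. Let $(\Omega, \Sigma, \mathbb{P})$ be a probability space. A random generalized function $V(x, \omega)$, where $x \in N$ and $\omega \in \Omega$, on $N$ is a measurable map $V : \Omega \to \mathcal{D}'(N)$. Below, following the tradition in the study of stochastic processes, we often omit the variable $\omega$ and just denote a random generalized function by $V(x)$.
White noise $W$ is a random generalized function on $N$ such that the inner products $\langle W, \phi \rangle$ are Gaussian random variables for all $\phi \in C^\infty(N)$, $\mathbb{E}\langle W, \phi \rangle = 0$, and
(1.7) $\mathbb{E}\big(\langle W, \phi \rangle \langle W, \psi \rangle\big) = \langle \phi, \psi \rangle_{L^2(N)}$ for all $\phi, \psi \in C^\infty(N)$.
The covariance operator of Gaussian white noise is the identity operator. Then $W$ can be considered as a function $W : \Omega \to \mathcal{D}'(N)$, where $\Omega$ is the probability space. A realization of $W$ is the generalized function $W(\omega_0)$ on $N$ with a fixed $\omega_0 \in \Omega$.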
Below, we consider the case when
(1.8) 
where are such that is an orthogonal basis in . Then with e as above. For example, when is a dimensional torus, can be the truncation of the Fourier series. See Section 5 for a detailed discussion on discrete and continuous noise models.
Now we can state the main motivation behind this study. The probability density function of $W$ is often formally written in the form
(1.9) $\pi_W(w) = c \exp\big(-\tfrac{1}{2}\|w\|_{L^2(N)}^2\big)$.
However, despite formula (1.9), the realizations of white Gaussian noise are almost surely not in $L^2(N)$. Thus we cannot use formula (1.5) when the error in the measurement is white Gaussian noise. Let us illustrate this “white noise paradox” by a simple example.
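Example 1. Let $W$ be normalized Gaussian white noise defined on the $d$-dimensional torus $\mathbb{T}^d$. The Fourier coefficients of $W$ are normally distributed with variance one, that is, $\langle W, e_\ell \rangle \sim \mathcal{N}(0, 1)$, where $e_\ell(x) = (2\pi)^{-d/2} e^{i \ell \cdot x}$ and $\ell \in \mathbb{Z}^d$. Hence
$$\mathbb{E}\|W\|_{L^2(\mathbb{T}^d)}^2 = \sum_{\ell \in \mathbb{Z}^d} \mathbb{E}|\langle W, e_\ell \rangle|^2 = \sum_{\ell \in \mathbb{Z}^d} 1 = \infty.$$
This implies that $W \in L^2(\mathbb{T}^d)$ with probability zero. However, when $s < -d/2$,
(1.10) $\mathbb{E}\|W\|_{H^s(\mathbb{T}^d)}^2 = \sum_{\ell \in \mathbb{Z}^d} (1 + |\ell|^2)^{s}\, \mathbb{E}|\langle W, e_\ell \rangle|^2 = \sum_{\ell \in \mathbb{Z}^d} (1 + |\ell|^2)^{s} < \infty$,
and hence $W$ takes values in $H^s(\mathbb{T}^d)$ almost surely (that is, with probability one).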
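On the other hand, [46, Theorem 2] implies that if $W \in H^s(\mathbb{T}^d)$ almost surely then $\mathbb{E}\|W\|_{H^s(\mathbb{T}^d)}^2 < \infty$, which yields $s < -d/2$. We conclude that the realisations of white noise are almost surely in the space $H^s(\mathbb{T}^d)$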
if and only if . In particular for the function is in only when where .
Even though the previous example is proven on $\mathbb{T}^d$, we note that the same result is valid in all open bounded subsets of $\mathbb{R}^d$.
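The dichotomy in Example 1 can be checked numerically by truncating the frequency sums. The sketch below is a minimal illustration for the one-dimensional torus ($d = 1$), assuming the Sobolev weight $(1 + \ell^2)^{s}$ with negative $s$ and unit-variance Fourier coefficients; the function names and truncation levels are hypothetical choices made for illustration only.

```python
def expected_l2_norm_sq(max_freq):
    """Truncated expected squared L^2 norm: one unit of variance per
    frequency l with |l| <= max_freq, so the sum is 2 * max_freq + 1."""
    return float(2 * max_freq + 1)


def expected_sobolev_norm_sq(max_freq, s):
    """Truncated expected squared H^s norm on the 1D torus:
    the sum of (1 + l^2)^s over |l| <= max_freq."""
    return sum((1.0 + l * l) ** s for l in range(-max_freq, max_freq + 1))
```

With `s = -1.0` (below $-1/2$) the truncated sums stabilise as `max_freq` grows, while with `s = -0.25` (above $-1/2$) or in the $L^2$ case they keep growing without bound.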
1.3. Main result
Let us again consider a general closed $d$-dimensional Riemannian manifold $N$ and let $\Delta$ be the Laplace operator on $N$. Furthermore, let $A$ be a pseudodifferential operator. Consider the following measurement model:
(1.11) $m_\delta = Au + \delta\varepsilon$,
where $\varepsilon = W(\omega_0)$ with $\omega_0 \in \Omega$ is a realization of white noise.
The pseudodifferential operator can be, for example,
where and in an open neighbourhood of the diag, we have
where is a distance function, , and . In this case is a pseudodifferential operator of order .
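Let us now modify formula (1.5) to arrive at something useful for white Gaussian noise. Expand the data fidelity term as $\|Au - m\|_{L^2}^2 = \|Au\|_{L^2}^2 - 2\langle m, Au \rangle + \|m\|_{L^2}^2$. Simply omitting the “constant term” $\|m\|_{L^2}^2$, which does not depend on $u$, leads to the definition
(1.12) $T_\alpha(m_\delta) = \operatorname*{arg\,min}_{u \in H^r} \big\{ \|Au\|_{L^2}^2 - 2\langle m_\delta, Au \rangle + \alpha \|u\|_{H^r}^2 \big\}$,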
where we can interpret as a suitable duality pairing instead of inner product. When is a pseudodifferential operator of order , we can define .
It is well known that the solution of the finite-dimensional problem (1.4) can be calculated using the following formula:
(1.13) $T_\alpha^{n,k}(m_k) = (\mathbf{A}^T \mathbf{A} + \alpha L^T L)^{-1} \mathbf{A}^T m_k$.
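For a finite-dimensional Tikhonov problem of the type (1.4), the minimizer is commonly computed from the normal equations $(\mathbf{A}^T \mathbf{A} + \alpha L^T L) u = \mathbf{A}^T m$. The following dependency-free Python sketch solves them by Gaussian elimination for small dense matrices. All function names are hypothetical, the dense solver is chosen for readability rather than efficiency, and the two helper matrices are only simple stand-ins for an identity penalty and a first-difference penalty.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Dense matrix product X @ Y."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def matvec(X, v):
    """Dense matrix-vector product X @ v."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if M is singular."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def identity(n):
    """n x n identity matrix (penalty on the size of u)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def first_difference(n):
    """(n-1) x n forward-difference matrix (penalty on the derivative of u)."""
    return [[-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(n)]
            for i in range(n - 1)]


def tikhonov(A, m, alpha, L):
    """Minimizer of ||A u - m||^2 + alpha ||L u||^2 via the normal equations."""
    At, Lt = transpose(A), transpose(L)
    AtA, LtL = matmul(At, A), matmul(Lt, L)
    n = len(AtA)
    lhs = [[AtA[i][j] + alpha * LtL[i][j] for j in range(n)] for i in range(n)]
    return solve(lhs, matvec(At, m))
```

For example, `tikhonov(identity(2), [2.0, 4.0], 1.0, identity(2))` halves the data, since the normal equations reduce to $2u = m$. With `first_difference` as the penalty, constant data are reproduced unchanged because the difference matrix annihilates constants.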
The regularized solution of the continuous problem (1.12) is
(1.14) $T_\alpha(m_\delta) = \big(A^* A + \alpha (I - \Delta)^r\big)^{-1} A^* m_\delta$.
The regularization parameter is chosen to be a function of the noise amplitude: $\alpha = \alpha_0 \delta^\kappa$, where $\alpha_0 > 0$ is a constant and $\kappa > 0$. We will now formulate the main theorem of this paper, concerning the continuous regularized solution (1.14).
Theorem 1.
Let be a dimensional closed manifold and with . Here . Let with some and consider the measurement
(1.15) 
where , is an elliptic pseudodifferential operator of order on the manifold with and . Assume that is injective. The regularization parameter is chosen to be where is a constant and .
Take . Then the following convergence takes place in norm:
Furthermore, we have the following estimates for the speed of convergence:

If then

If then
Above we have .
Notice that in case (i) and in case (ii) . The different convergence speeds (i) and (ii) show the tradeoff between smoothness of the space and the speed of convergence. In case (i) we get better convergence rates but in case (ii) we can use a stronger norm. In section 4 we give two counterexamples to show that even though and the regularized solution does not converge to the real solution in norm.
1.4. Literature review
There are two main approaches to modelling noise in the inverse problems literature. The first approach, based on deterministic regularization techniques, is to assume that the noise is deterministic and small. In that case one has a norm estimate of the noise and can study what happens when $\delta \to 0$. This approach was originated by Tikhonov [51, 52], and studied in depth in [5, 11, 17, 24, 42, 40, 53]. The second approach to handling the noise is based on the statistical point of view. The statistical modelling of noise in inverse problems started in the early papers [14, 15, 49, 50], and it is notable that with this approach one need not assume smallness of the noise. For some recent references on the frequentist view of statistical problems see [3, 19, 35, 39]. Another statistical way to study inverse problems with random noise is based on the Bayesian approach, where $u$ and $\varepsilon$ are considered to be realizations of random variables; see [6, 18, 22, 28, 29, 30, 31, 32, 33, 43, 44, 45].
The deterministic regularization and statistical approaches differ both in assumptions and techniques. This paper aims to bridge the gap between them. Our results are closely related to earlier studies of Eggermont, LaRiccia, and Nashed [8, 9, 10], who studied weakly bounded noise. They assume that the noise is an $L^2$ function and discuss regularization techniques when the noise tends to zero in the weak topology of $L^2$. This kind of relaxed assumption on the noise covers small low frequency noise and large high frequency noise. However, even though $\delta\varepsilon$ tends to zero in the weak sense as $\delta \to 0$ when $\varepsilon$ is a realization of the normalized white noise, this type of noise lies outside the definition of weakly bounded noise, as $\varepsilon$ is not almost surely $L^2$-valued.
A related approach of smoothing the noise before the analysis is described in [37, 38]. A similar regularization method where no smoothness of the operator is assumed, but instead the regularization method is modified, is studied in [7]. Another possible approach to deal with white noise is to first perform a data projection step and then proceed to Tikhonov regularization [26, 25]. Also, Hohage and Werner have earlier studied inverse problems taking into account the fact that white noise is not square-integrable in [20].
Our new results are different from all of those previous studies. Our approach aims to study the effect of the continuous white noise not being an $L^2$ function in Tikhonov regularisation (1.5), instead of modifying the problem by altering the regularisation method or assumptions.
2. Analysis of the translation-invariant case
Before giving the general proof of Theorem 1 in Section 3, we motivate it by proving a similar lemma for the translation-invariant case.
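The regularized solution we are studying is of the form
$$u_\delta = \operatorname*{arg\,min}_{u \in H^r} \big\{ \|Au\|_{L^2}^2 - 2\langle m_\delta, Au \rangle + \alpha \|u\|_{H^r}^2 \big\},$$
where $\alpha = \alpha_0 \delta^\kappa$, for some constant $\alpha_0 > 0$ and $\kappa > 0$. As mentioned before, the solution to this is
(2.1) $u_\delta = \big(A^* A + \alpha (I - \Delta)^r\big)^{-1} A^* m_\delta$.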
Let us consider the case when , where is a constant, and is an elliptic pseudodifferential operator of order that commutes with translations. Then, in we have that . As and commute with translations they are Fourier multipliers,
and since is elliptic there is so that
The symbol of is
and thus
If and
Now when we have
Thus writing
(2.2) 
we see that
where
Here,
and thus if by dominated convergence theorem
Above the limit speed of convergence can be analysed using the standard regularization theory [11] and the fact that
We can use the fact that and write
We also have the inequality . When we can define and so that, , . We get
Hence we obtain
On the other hand,
Hence
where . Because we proved the convergence of in we have to have . This is true at least when . Thus adding the above results together we can formulate the next lemma.
Lemma 2.
Let , , be Gaussian distributed, , , and
where , , is an elliptic pseudodifferential operator of order that commutes with translations. We assume that . Then for the regularized solution
of we have
where and . Furthermore we have the following estimate of the speed of convergence
Proof. The convergence is an immediate consequence of the above results. For the convergence speed we get
where and .
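In the translation-invariant setting the regularized solution acts frequency by frequency, which makes the convergence easy to probe numerically. The sketch below is only an illustration under explicit assumptions that are not taken from the text: a one-dimensional torus, real coefficients in an orthonormal basis indexed by $\ell$, the hypothetical symbol $a(\ell) = (1 + \ell^2)^{-t/2}$ for an operator of order $-t$, the parameter choice $\alpha = \alpha_0 \delta^\kappa$, and a smooth test unknown with coefficients $(1 + \ell^2)^{-2}$. Each coefficient of the reconstruction is $a(\ell)\, \hat m(\ell) / (a(\ell)^2 + \alpha (1 + \ell^2)^r)$, which is the Fourier-side form of minimizing $\|Au\|^2 - 2\langle m, Au \rangle + \alpha \|u\|_{H^r}^2$.

```python
import math
import random


def regularized_coefficients(m_hat, freqs, t, r, alpha):
    """Apply the regularized inverse multiplier to each data coefficient."""
    out = []
    for l, m in zip(freqs, m_hat):
        w = 1.0 + l * l
        a = w ** (-t / 2.0)  # assumed symbol of the forward operator
        out.append(a * m / (a * a + alpha * w ** r))
    return out


def sobolev_error(u_hat, v_hat, freqs, zeta):
    """H^zeta norm of the difference of two coefficient sequences."""
    return math.sqrt(sum((1.0 + l * l) ** zeta * (x - y) ** 2
                         for l, x, y in zip(freqs, u_hat, v_hat)))


def reconstruction_error(delta, t=1.0, r=1.0, kappa=1.0, alpha0=1.0,
                         zeta=0.0, max_freq=200, seed=0):
    """Simulate m = A u + delta * (white noise) and return the H^zeta error
    of the regularized reconstruction (all defaults are illustrative)."""
    rng = random.Random(seed)
    freqs = list(range(-max_freq, max_freq + 1))
    u_hat = [(1.0 + l * l) ** -2.0 for l in freqs]    # smooth test unknown
    noise = [rng.gauss(0.0, 1.0) for _ in freqs]      # unit-variance coefficients
    m_hat = [(1.0 + l * l) ** (-t / 2.0) * u + delta * e
             for l, u, e in zip(freqs, u_hat, noise)]
    alpha = alpha0 * delta ** kappa
    rec = regularized_coefficients(m_hat, freqs, t, r, alpha)
    return sobolev_error(rec, u_hat, freqs, zeta)
```

Evaluating `reconstruction_error` for a decreasing sequence of `delta` values gives a rough empirical view of how the error shrinks with the noise amplitude; changing `zeta` or `kappa` shows how the choice of norm and parameter rule affects that decay in this toy model.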
3. Proof of the main theorem
Here we study the general case where is an elliptic pseudodifferential operator of order . We denote and where is a closed manifold and . As in the previous example we have
(3.1) 
where .
First we will show that is invertible. We define as the adjoint of an operator . We assume that is onetoone. If then
which implies and furthermore . Thus the operator is onetoone.
Next we recall the fact that an elliptic operator is a Fredholm operator and ([21] Theorem 19.2.1). Indeed index of a Fredholm operator is
(3.2) 
If is compact and is the adjoint of the operator then
Define as an extension of and show that for all . Define
We can write where is compact. Now
Because is compact we can write
and hence we see that . Using this, the knowledge that is onetoone and (3.2) we get
which means that is also onto. Thus we have shown that there exist .
Next we will examine ΨDOs that depend on a spectral variable . For the general theory see [47]. The symbol class consists of the functions such that

for every fixed and

for arbitrary multiindices and and for any compact set there exist constants such that
for , and .
We consider the pseudodifferential operators depending on the parameter . To define such operators, one considers local coordinates of the manifold , where we emphasize that the set does not need to be connected (see [47, Sect. I.4.3]). A bounded linear operator , depending on the parameter , is a pseudodifferential operator with spectral variable if for any local coordinates of manifold , , there is a symbol such that for we have
where . In this case we will write
and say that in local coordinates the operator has the symbol . If for all compact sets there are constants such that the symbol satisfies
for and , we say that is hypoelliptic with parameter and denote . We will denote by the class of DOs depending on the parameter whose symbol in all local coordinates belongs in
We want to prove that
is invertible. Operator is elliptic since is elliptic and . Denote and its symbol . Then for the symbol of the operator we have in compact subsets of any local coordinates
By [47, Theorem 9.2] there exists such that for the operator is invertible with
Now we have shown that the operator can be rewritten
(3.3) 
where .
We denote by the norm of where . We have the following norm estimates for when and large enough
(3.4) 