On Steepest-Descent-Kaczmarz methods for regularizing systems of nonlinear ill-posed equations
We investigate modified steepest descent methods coupled with a loping Kaczmarz strategy for obtaining stable solutions of nonlinear systems of ill-posed operator equations. We show that the proposed method is a convergent regularization method. Numerical tests are presented for a linear problem related to photoacoustic tomography and a nonlinear problem related to the testing of semiconductor devices.
Keywords. Nonlinear systems; Ill-posed equations; Regularization; Steepest descent method; Kaczmarz method.
AMS Classification: 65J20, 47J06.
In this paper we propose a new method for obtaining regularized approximations of systems of nonlinear ill-posed operator equations.
The inverse problem we are interested in consists of determining an unknown physical quantity $x \in X$ from the set of data $(y_0, \dots, y_{N-1})$, where $X$, $Y_i$ are Hilbert spaces and $i = 0, \dots, N-1$. In practical situations, we do not know the data exactly. Instead, we have only approximate measured data $y_i^\delta \in Y_i$ satisfying
$$ \| y_i^\delta - y_i \| \le \delta_i \,, \qquad i = 0, \dots, N-1 \,, \tag{1} $$
with $\delta_i > 0$ (noise level). We use the notation $\delta := (\delta_0, \dots, \delta_{N-1})$. The finite set of data above is obtained by indirect measurements of the parameter, this process being described by the model
$$ F_i(x) = y_i \,, \qquad i = 0, \dots, N-1 \,, \tag{2} $$
where $F_i : D_i \subseteq X \to Y_i$, and $D_i$ are the corresponding domains of definition.
Standard methods for the solution of system (2) are based on the use of iterative regularization methods [1, 7, 13, 16, 19] or Tikhonov-type regularization methods [7, 23, 30, 32, 33] after rewriting (2) as a single equation $F(x) = y$, where
$$ F := (F_0, \dots, F_{N-1}) : D(F) := \bigcap_{i=0}^{N-1} D_i \to Y_0 \times \dots \times Y_{N-1} \tag{3} $$
and $y := (y_0, \dots, y_{N-1})$. However, these methods become inefficient if $N$ is large or the evaluations of $F_i(x)$ and $F_i'(x)^*$ are expensive. In such a situation, Kaczmarz-type methods [6, 15, 22, 26], which cyclically consider each equation in (2) separately, are much faster and are often the method of choice in practice.
The starting point of our approach is the steepest descent method [7, 29] for solving ill-posed problems. Motivated by the ideas in [10, 11], we propose in this article a loping Steepest-Descent-Kaczmarz method (l-SDK method) for the solution of (2). This iterative method is defined by
$$ x_{k+1}^\delta = x_k^\delta - \omega_k \lambda_k^\delta s_k^\delta \,, \tag{4} $$
where
$$ s_k^\delta := F_{[k]}'(x_k^\delta)^* \bigl( F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr) \,, \tag{5} $$
$$ \lambda_k^\delta := \Phi\bigl( \| s_k^\delta \| \bigr) \,, \tag{6} $$
$$ \omega_k := \begin{cases} 1 & \text{if } \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| > \tau \delta_{[k]} \,, \\ 0 & \text{otherwise} \,. \end{cases} \tag{7} $$
Here $\tau > 1$ and the relaxation function $\Phi$ are appropriately chosen (see (13), (14) below), $[k] := k \bmod N$, and $x_0^\delta := x_0 \in X$ is an initial guess, possibly incorporating some a priori knowledge about the exact solution. The function $\Phi$ defines the sequence of relaxation parameters $\lambda_k^\delta$ in (6) and is assumed to be continuous, monotonically increasing, bounded by a constant $\Phi_{\max}$, and to satisfy condition (14) (see Figure 1).
If $S$ is an upper bound for the norms $\| s_k^\delta \|$, then $\lambda_k^\delta = \Phi(\| s_k^\delta \|) \le \Phi(S)$ (cf. Lemma 3.2). Hence the relaxation function $\Phi$ needs only be defined on the interval $[0, S]$. In particular, if one chooses $\Phi$ to be constant on that interval, then the $\lambda_k^\delta$ are constant and the l-SDK method reduces to the loping Landweber-Kaczmarz (l-LK) method considered in [10, 11]. The convergence analysis of the l-LK method requires a sufficiently small constant relaxation parameter, whereas the adaptive choice of the relaxation parameters in the present paper allows the $\lambda_k^\delta$ to be much larger than such a constant.
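To illustrate the potential gain, consider a single ill-conditioned linear equation $A x = y$. The snippet below is a toy example of our own (NumPy; the matrix and data are hypothetical): it compares the classical steepest-descent step size $\| s \|^2 / \| A s \|^2$, one natural adaptive choice, with a constant Landweber-type step $1 / \| A \|^2$. It is a sketch of the general principle only, not of the specific relaxation function used in this paper.

```python
import numpy as np

# Hypothetical ill-conditioned diagonal "forward operator" A, with data y
# generated so that its dominant component lies in the direction of the
# smallest singular value of A.
A = np.diag([1.0, 0.1, 0.01])
y = A @ np.array([0.0, 0.0, 1.0])

x = np.zeros(3)
s = A.T @ (A @ x - y)                            # gradient of 0.5 * ||A x - y||^2

sd_step = (s @ s) / np.linalg.norm(A @ s) ** 2   # classical steepest-descent step
lw_step = 1.0 / np.linalg.norm(A, 2) ** 2        # constant Landweber step 1 / ||A||^2

# For this right-hand side the adaptive step is larger by orders of magnitude,
# and a single steepest-descent update already matches the data exactly.
x_new = x - sd_step * s
```

Here `sd_step` is $10^4$ while `lw_step` equals $1$: a constant-step iteration would need thousands of sweeps to make the progress of one adaptive step in this direction.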
The l-SDK method consists in incorporating the Kaczmarz strategy (with the loping parameters $\omega_k$) into the steepest descent method. This strategy is analogous to the one introduced in [10, 11] for the Landweber-Kaczmarz iteration. As usual in Kaczmarz-type algorithms, a group of $N$ subsequent steps (starting at some multiple of $N$) shall be called a cycle. The iteration should be terminated when, for the first time, all $x_k^\delta$ are equal within a cycle. That is, we stop the iteration at
$$ k_*^\delta := \min \bigl\{ lN : x_{lN}^\delta = x_{lN+1}^\delta = \dots = x_{lN+N}^\delta \,,\ l \in \mathbb{N} \bigr\} \,. \tag{8} $$
Notice that $k_*^\delta$ is the smallest multiple of $N$ such that
$$ \| F_i(x_{k_*^\delta}^\delta) - y_i^\delta \| \le \tau \delta_i \,, \qquad i = 0, \dots, N-1 \,. \tag{9} $$
In the case of noise-free data, $\delta_i = 0$ in (1), we choose $\omega_k = 1$ for all $k$, and the iteration (4) - (7) reduces to the Steepest-Descent-Kaczmarz (SDK) method, which is closely related to the Landweber-Kaczmarz (LK) method.
It is worth noticing that, for noisy data, the l-SDK method is fundamentally different from the SDK method: the bang-bang relaxation parameter $\omega_k$ forces the iterates defined in (4) to become stationary once all components of the residual vector fall below a pre-specified threshold. This characteristic renders (4) - (7) a regularization method (see Section 3). Another consequence of using these relaxation parameters is the fact that, after a large number of iterations, $\omega_k$ will vanish for some $k$ within each iteration cycle. Therefore, the computationally expensive evaluation of the corresponding steepest descent direction $s_k^\delta$ may be loped (skipped), making the l-SDK method in (4) - (7) a fast alternative to the LK method. Since in practice the steepest descent method performs better than the Landweber method, the l-SDK method is expected to be more efficient than the l-LK method [10, 11]. Our numerical experiments (mainly for the nonlinear problem considered in Section 5) corroborate this conjecture.
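To make the interplay of the loping parameters, the cyclic sweep, and the stopping rule concrete, the following minimal sketch implements a loping steepest-descent Kaczmarz sweep for linear model operators $F_i(x) = A_i x$ (our own simplification). All names are ours, and the classical steepest-descent formula $\| s \|^2 / \| A s \|^2$ stands in for the relaxation function of the text, so this illustrates the loping strategy under stated assumptions rather than the paper's exact method.

```python
import numpy as np

def l_sdk(A_list, y_list, delta_list, x0, tau=2.0, max_cycles=100):
    """Loping steepest-descent Kaczmarz sweep for linear operators A_i x = y_i.

    delta_list[i] is the noise level of the i-th data set.  The step size
    uses the classical steepest-descent formula, standing in for the
    relaxation function of the text (an assumption of this sketch).
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_cycles):
        any_active = False                   # tracks whether some omega_k = 1
        for A, y, delta in zip(A_list, y_list, delta_list):
            residual = A @ x - y
            # loping parameter omega_k: skip the update (omega_k = 0) once
            # this component's residual falls below the threshold tau * delta
            if np.linalg.norm(residual) <= tau * delta:
                continue
            any_active = True
            s = A.T @ residual                           # steepest descent direction
            lam = (s @ s) / np.linalg.norm(A @ s) ** 2   # adaptive step size
            x -= lam * s
        if not any_active:                   # all omega_k vanished within a cycle:
            break                            # discrepancy-type stopping rule
    return x
```

On a consistent $2 \times 2$ toy system split into two scalar equations, one cycle already produces the exact solution, and the next cycle triggers the stopping rule because every residual falls below its threshold.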
The article is outlined as follows. In Section 2 we formulate basic assumptions and derive some auxiliary estimates required for the analysis. In Section 3 we provide a convergence analysis for the l-SDK method. In Sections 4 and 5 we compare the numerical performance of the l-SDK method with other standard methods for inverse problems in photoacoustic tomography and in semiconductor device testing, respectively.
2 Assumptions and Basic Results
We begin this section by introducing some assumptions that are necessary for the convergence analysis presented in the next section. These assumptions derive from the classical conditions used in the analysis of iterative regularization methods [7, 16, 29].
First, we assume that the operators $F_i$ are continuously Fréchet differentiable, and also that there exist $x_0 \in X$, $M > 0$, and $\rho > 0$ such that
$$ \| F_i'(x) \| \le M \,, \qquad x \in B_\rho(x_0) \subseteq \bigcap_{i=0}^{N-1} D_i \,, \quad i = 0, \dots, N-1 \,. \tag{10} $$
Notice that $x_0$ is used as starting value of the l-SDK iteration. Next we make a uniform assumption on the nonlinearity of the operators $F_i$. Namely, we assume that the local tangential cone condition [7, 16]
$$ \| F_i(\bar{x}) - F_i(x) - F_i'(x)(\bar{x} - x) \| \le \eta \, \| F_i(\bar{x}) - F_i(x) \| \,, \qquad x, \bar{x} \in B_\rho(x_0) \,, \tag{11} $$
holds for some $\eta < 1/2$. Moreover, we assume the existence of an element $x^* \in B_{\rho/2}(x_0)$ such that
$$ F_i(x^*) = y_i \,, \qquad i = 0, \dots, N-1 \,, \tag{12} $$
where $y_i$ are the exact data satisfying (1).
In particular, for linear problems we can choose $\tau$ equal to 2.
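The local tangential cone condition can be probed numerically for a concrete operator. The sketch below is a one-dimensional toy example of our own (the operator $F(x) = e^x$ and the radius are hypothetical choices): it estimates the smallest admissible nonlinearity constant on a small ball by random sampling of the ratio between the linearization error and the change in the data.

```python
import numpy as np

# Toy nonlinear operator F(x) = exp(x) with derivative F'(x) h = exp(x) h.
F = np.exp
dF = np.exp   # F'(x)

rng = np.random.default_rng(0)
x0, rho = 0.0, 0.1                  # ball of radius rho around x0
ratios = []
for _ in range(1000):
    x, xb = x0 + rho * (2 * rng.random(2) - 1)   # two random points in the ball
    if np.isclose(F(xb), F(x)):
        continue
    lhs = abs(F(xb) - F(x) - dF(x) * (xb - x))   # linearization error
    rhs = abs(F(xb) - F(x))                      # change in the data
    ratios.append(lhs / rhs)

eta_est = max(ratios)   # empirical lower bound on the admissible eta
```

For this operator and radius the sampled ratios stay close to $|{\bar x} - x|/2$, so `eta_est` is roughly $0.1$, comfortably below the bound $1/2$ required above; enlarging the ball drives the estimate up, which is why the condition is a *local* one.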
In the sequel we verify some basic results that are necessary for the convergence analysis derived in the next section. The first result concerns the well-definedness and positivity of the relaxation parameters $\lambda_k^\delta$.
In the next lemma we prove an estimate for the step size of the l-SDK iteration.
The following lemma is an important auxiliary result, which will be used at several places throughout this article.
Our next goal is to prove a monotonicity property, known to be satisfied by other iterative regularization methods, e.g., by the Landweber, the steepest descent, the LK, and the l-LK methods.
Proposition 2.4 (Monotonicity).
3 Convergence Analysis of the Loping Steepest Descent Kaczmarz Method
In this section we provide a complete convergence analysis for the l-SDK iteration, showing that it is a convergent regularization method (see Theorems 3.3 and 3.6 below). Throughout this section, we assume that (10) - (14) hold, and that $x_k^\delta$, $s_k^\delta$, $\lambda_k^\delta$, and $\omega_k$ are defined by (4) - (7).
Our first goal is to prove convergence of the l-SDK iteration for exact data, i.e., $\delta = 0$. In this case, the iterates in (4) are denoted by $x_k$ (a standard notation used in the literature).
For all , we have .
For the claimed estimate holds with equality. If , it follows from (10) that
Now the monotonicity of the relaxation function $\Phi$ implies the assertion. ∎
Theorem 3.3 (Convergence for Exact Data).
For exact data, the iteration $(x_k)$ converges to a solution of (2) as $k \to \infty$. Moreover, if
From (18) it follows that $\| x_k - x^* \|$ decreases monotonically and therefore converges to some $\varepsilon \ge 0$. In the following we show that $(x_k)$ is in fact a Cauchy sequence.
For and with and , let be such that
Then, with , we have
As $k \to \infty$, the first two terms of (23) converge to the same finite limit. Therefore, in order to show that $(x_k)$ is a Cauchy sequence, it is sufficient to prove that the two remaining terms in (23) converge to zero as $k \to \infty$.
To that end, we write , and set . Then, using the definition of the steepest descent Kaczmarz iteration it follows that
From (11) it follows that
with . Here we made use of (21). So, we finally obtain the estimate
Because of (19), the last sum tends to zero as $k \to \infty$, and therefore so does the corresponding term in (23). Analogously one shows that the remaining term tends to zero. Therefore $(x_k)$ is a Cauchy sequence and converges to some element; because all residuals tend to zero, this limit is a solution of (2).
Proposition 3.4 (Stopping Index).
Assume that the noise levels $\delta_i$ in (1) are positive. Then the stopping index $k_*^\delta$ defined in (8) is finite, and
Using the fact that either $\omega_k = 0$ or $\| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| > \tau \delta_{[k]}$, we obtain
The last auxiliary result concerns the continuity of the l-SDK iterates at $\delta = 0$. For , , and we define
For all ,
Moreover, , as .
We prove Lemma 3.5 by induction. The base case is similar to the general case and is omitted.
In the second case, , we have and consequently
Theorem 3.6 (Convergence for Noisy Data).
To show the desired convergence, we first assume that the sequence of stopping indices $(k_*^{\delta_n})$ has a finite accumulation point $\bar{k}$. Without loss of generality we may assume that $k_*^{\delta_n} = \bar{k}$ for all $n$. From Proposition 3.4 we know that the residuals at the stopping index are bounded by $\tau$ times the corresponding noise levels and, by taking the limit $n \to \infty$, that $F_i(x_{\bar{k}}) = y_i$ for all $i$. Consequently, $x_{\bar{k}}$ is a solution of (2), and as