# On Steepest-Descent-Kaczmarz methods for regularizing systems of nonlinear ill-posed equations

## Abstract

We investigate modified steepest descent methods coupled with a loping Kaczmarz strategy for obtaining stable solutions of nonlinear systems of ill-posed operator equations. We show that the proposed method is a convergent regularization method. Numerical tests are presented for a linear problem related to photoacoustic tomography and a non-linear problem related to the testing of semiconductor devices.

Keywords. Nonlinear systems; Ill-posed equations; Regularization; Steepest descent method; Kaczmarz method.

AMS Classification: 65J20, 47J06.

## 1 Introduction

In this paper we propose a new method for obtaining regularized approximations of systems of nonlinear ill-posed operator equations.

The inverse problem we are interested in consists of determining an unknown physical quantity $x \in X$ from the set of data $(y_0, \dots, y_{N-1}) \in Y^N$, where $X$, $Y$ are Hilbert spaces and $N \ge 1$. In practical situations, we do not know the data exactly. Instead, we have only approximate measured data $y_i^\delta \in Y$ satisfying

$$\| y_i^\delta - y_i \| \le \delta_i, \quad i = 0, \dots, N-1, \tag{1}$$

with $\delta_i > 0$ (noise level). We use the notation $\delta := (\delta_0, \dots, \delta_{N-1})$. The finite set of data above is obtained by indirect measurements of the parameter, this process being described by the model

$$F_i(x) = y_i, \quad i = 0, \dots, N-1, \tag{2}$$

where $F_i : D_i \subset X \to Y$, and $D_i$ are the corresponding domains of definition.

Standard methods for the solution of system (2) are based on iterative-type regularization methods [1, 7, 13, 16, 19] or Tikhonov-type regularization methods [7, 23, 30, 32, 33] after rewriting (2) as a single equation $F(x) = y$, where

$$F := (F_0, \dots, F_{N-1}) : \bigcap_{i=0}^{N-1} D_i \to Y^N \tag{3}$$

and $y := (y_0, \dots, y_{N-1})$. However, these methods become inefficient if $N$ is large or the evaluations of $F_i(x)$ and $F_i'(x)^*$ are expensive. In such a situation, Kaczmarz-type methods [6, 15, 22, 26], which cyclically consider each equation in (2) separately, are much faster [24] and are often the method of choice in practice.

For recent analysis of Kaczmarz type methods for systems of ill-posed equations, we refer the reader to [4, 10, 11, 17].

The starting point of our approach is the steepest descent method [7, 29] for solving ill-posed problems. Motivated by the ideas in [10, 11], we propose in this article a loping Steepest-Descent-Kaczmarz method (l-SDK method) for the solution of (2). This iterative method is defined by

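$$x_{k+1}^\delta = x_k^\delta - \omega_k \alpha_k s_k, \tag{4}$$

where

$$s_k := F'_{[k]}(x_k^\delta)^* \bigl( F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr), \tag{5}$$

$$\omega_k := \begin{cases} 1 & \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \ge \tau \delta_{[k]} \\ 0 & \text{otherwise}, \end{cases} \tag{6}$$

$$\alpha_k := \begin{cases} \Phi_{\mathrm{rel}}\bigl( \|s_k\|^2 / \|F'_{[k]}(x_k^\delta) s_k\|^2 \bigr) & \omega_k = 1 \\ \alpha_{\min} & \omega_k = 0. \end{cases} \tag{7}$$

Here $\alpha_{\min}$, $\tau$ are appropriately chosen numbers (see (13), (14) below), $[k] := (k \bmod N) \in \{0, \dots, N-1\}$, and $x_0^\delta = x_0 \in X$ is an initial guess, possibly incorporating some a priori knowledge about the exact solution. The function $\Phi_{\mathrm{rel}} : (0, \infty) \to (0, \infty)$ defines a sequence of relaxation parameters and is assumed to be continuous, monotonically increasing, bounded by a constant $\alpha_{\max}$, and to satisfy $\Phi_{\mathrm{rel}}(s) \le s$ (see Figure 1).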

[Figure 1: The relaxation function $\Phi_{\mathrm{rel}}$.]

If $M$ is an upper bound for $\|F_i'(x)\|$, then $\|s_k\|^2 / \|F'_{[k]}(x_k^\delta) s_k\|^2 \ge 1/M^2$ (cf. Lemma 3.2). Hence the relaxation function $\Phi_{\mathrm{rel}}$ needs only be defined on $[1/M^2, \infty)$. In particular, if one chooses $\Phi_{\mathrm{rel}}$ to be constant on that interval, then $\alpha_k = \alpha_{\min}$ and the l-SDK method reduces to the loping Landweber-Kaczmarz (l-LK) method considered in [10, 11]. The convergence analysis of the l-LK method requires $\alpha_{\min} \le 1/M^2$, whereas the adaptive choice of the relaxation parameters in the present paper allows $\alpha_k$ to be much larger than $1/M^2$.
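As a minimal sketch of an admissible relaxation function, one may take the truncated identity $\Phi_{\mathrm{rel}}(s) = \min(s, \alpha_{\max})$, which is continuous, monotonically increasing, and bounded by $\alpha_{\max}$. This specific choice, the helper name `make_phi_rel`, and the sample values of `M` and `alpha_max` are illustrative assumptions and are not taken from the paper. The sketch additionally assumes the bound $\Phi_{\mathrm{rel}}(s) \le s$ and checks it on a grid.

```python
def make_phi_rel(alpha_max):
    """Build Phi_rel(s) = min(s, alpha_max) for s > 0 (an illustrative choice)."""
    if alpha_max <= 0:
        raise ValueError("alpha_max must be positive")

    def phi_rel(s):
        if s <= 0:
            raise ValueError("phi_rel is only defined for s > 0")
        return min(s, alpha_max)

    return phi_rel


M = 2.0                           # assumed bound on ||F_i'(x)||, cf. (10)
alpha_max = 1.0                   # assumed upper bound for the relaxation parameters
phi_rel = make_phi_rel(alpha_max)
alpha_min = phi_rel(1.0 / M**2)   # choice (13): alpha_min = Phi_rel(1/M^2)

# Sample the relevant interval [1/M^2, infinity) on a geometric grid.
samples = [(1.0 / M**2) * 1.5**j for j in range(12)]
values = [phi_rel(s) for s in samples]

is_monotone = all(a <= b for a, b in zip(values, values[1:]))
is_bounded = all(v <= alpha_max for v in values)
below_identity = all(v <= s for v, s in zip(values, samples))
print(alpha_min, is_monotone, is_bounded, below_identity)
```

With this choice the step size equals the classical steepest descent step whenever that step does not exceed `alpha_max`, and it is capped otherwise.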

The l-SDK method consists in incorporating the Kaczmarz strategy (with the loping parameters $\omega_k$) in the steepest descent method. This strategy is analogous to the one introduced in [11] regarding the Landweber-Kaczmarz iteration. As usual in Kaczmarz-type algorithms, a group of $N$ subsequent steps (starting at some multiple of $N$) shall be called a cycle. The iteration should be terminated when, for the first time, all $x_k^\delta$ are equal within a cycle. That is, we stop the iteration at

$$k_*^\delta := \min \{ lN \in \mathbb{N} : x_{lN}^\delta = x_{lN+1}^\delta = \dots = x_{lN+N-1}^\delta \}. \tag{8}$$

Notice that $k_*^\delta$ is the smallest multiple of $N$ such that

$$x_{k_*^\delta}^\delta = x_{k_*^\delta+1}^\delta = \dots = x_{k_*^\delta+N-1}^\delta. \tag{9}$$

In the case of noise-free data, $\delta_i = 0$ in (1), we choose $\omega_k = 1$ and the iteration (4) - (7) reduces to the Steepest-Descent-Kaczmarz (SDK) method, which is closely related to the Landweber-Kaczmarz (LK) method considered in [17].
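The iteration (4) - (7) together with the stopping rule can be sketched for the special case of linear operators $F_i(x) = A_i x$, where $F_i'(x)^* = A_i^T$. The sketch below is a hedged illustration in plain Python, not the authors' implementation: the function name `l_sdk_linear`, the relaxation function $\Phi_{\mathrm{rel}}(t) = \min(t, \alpha_{\max})$, the toy data, and the cycle budget `max_cycles` are all assumptions. It stops at the first cycle in which every one of the $N$ equations is loped, which is one way to read the rule that the iterate no longer changes within a cycle.

```python
import math


def matvec(A, x):
    """Return A x for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Return A^T r, the adjoint of x -> A x applied to r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def l_sdk_linear(blocks, data, deltas, x0, tau, alpha_max, max_cycles=1000):
    """Loping steepest-descent-Kaczmarz sketch for linear F_i(x) = A_i x.

    Returns (x, k_stop), where k_stop = l*N is the start index of the first
    cycle in which all N steps are loped, or None if max_cycles is exhausted.
    """
    N = len(blocks)
    x = list(x0)
    for cycle in range(max_cycles):
        changed = False
        for i in range(N):                       # i = [k] = k mod N
            A = blocks[i]
            r = [u - v for u, v in zip(matvec(A, x), data[i])]
            if norm(r) < tau * deltas[i]:        # omega_k = 0, see (6)
                continue                         # lope: skip the adjoint evaluation
            s = rmatvec(A, r)                    # s_k, see (5)
            As = matvec(A, s)
            if norm(As) == 0.0:                  # degenerate direction: take no step
                continue
            # alpha_k from (7) with the assumed Phi_rel(t) = min(t, alpha_max)
            alpha = min(norm(s) ** 2 / norm(As) ** 2, alpha_max)
            x = [xi - alpha * si for xi, si in zip(x, s)]   # update (4)
            changed = True
        if not changed:
            return x, cycle * N
    return x, None


# Toy problem (assumed data): two scalar equations in R^2, exact solution (1, 2).
blocks = [[[1.0, 1.0]], [[1.0, -1.0]]]
deltas = [0.01, 0.01]
data = [[3.0 + 0.01], [-1.0 - 0.01]]             # noisy data with |noise| <= delta_i
x_hat, k_stop = l_sdk_linear(blocks, data, deltas, [0.0, 0.0], tau=2.0, alpha_max=10.0)
residuals = [norm([u - v for u, v in zip(matvec(A, x_hat), y)])
             for A, y in zip(blocks, data)]
print(x_hat, k_stop, residuals)
```

Because the two rows in this toy problem are orthogonal, one cycle already brings both residuals below $\tau\delta_i$, and the second cycle is loped entirely.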

It is worth noticing that, for noisy data, the l-SDK method is fundamentally different from the SDK method: the bang-bang relaxation parameter $\omega_k$ causes the iterates defined in (4) to become stationary as soon as all components of the residual vector $\|F_i(x_k^\delta) - y_i^\delta\|$ fall below a pre-specified threshold. This characteristic renders (4) - (7) a regularization method (see Section 3). Another consequence of using these relaxation parameters is the fact that, after a large number of iterations, $\omega_k$ will vanish for some $k$ within each iteration cycle. Therefore, the computationally expensive evaluation of $F'_{[k]}(x_k^\delta)^*$ might be loped, making the l-SDK method in (4) - (7) a fast alternative to the LK method in [17]. Since in practice the steepest descent method performs better than the Landweber method, the l-SDK method is expected to be more efficient than the l-LK method [10, 11]. Our numerical experiments (mainly for the nonlinear problem considered in Section 5) corroborate this conjecture.

The article is outlined as follows. In Section 2 we formulate basic assumptions and derive some auxiliary estimates required for the analysis. In Section 3 we provide a convergence analysis for the l-SDK method. In Sections 4 and 5 we compare the numerical performance of the l-SDK method with other standard methods for inverse problems in photoacoustic tomography and in semiconductors respectively.

## 2 Assumptions and Basic Results

We begin this section by introducing some assumptions that are necessary for the convergence analysis presented in the next section. These assumptions derive from the classical assumptions used in the analysis of iterative regularization methods [7, 16, 29].

First, we assume that the operators $F_i$ are continuously Fréchet differentiable, and also that there exist $x_0 \in X$, $M > 0$, and $\rho > 0$ such that

$$\|F_i'(x)\| \le M, \quad x \in B_\rho(x_0) \subset \bigcap_{i=0}^{N-1} D_i. \tag{10}$$

Notice that $x_0^\delta = x_0$ is used as starting value of the l-SDK iteration. Next we make a uniform assumption on the nonlinearity of the operators $F_i$. Namely, we assume that the local tangential cone condition [7, 16]

$$\| F_i(x) - F_i(\bar{x}) - F_i'(x)(x - \bar{x}) \|_Y \le \eta \| F_i(x) - F_i(\bar{x}) \|_Y, \quad x, \bar{x} \in B_\rho(x_0) \tag{11}$$

holds for some $\eta < 1/2$. Moreover, we assume the existence of an element

$$x^* \in B_{\rho/2}(x_0) \quad \text{such that} \quad F(x^*) = y, \tag{12}$$

where $y = (y_0, \dots, y_{N-1})$ are the exact data satisfying (1).

We are now in a position to choose the positive constants $\alpha_{\min}$, $\tau$ in (7), (6). For the rest of this article we shall assume

$$\alpha_{\min} := \Phi_{\mathrm{rel}}(1/M^2), \tag{13}$$

$$\tau \ge 2\, \frac{1+\eta}{1-2\eta} \ge 2. \tag{14}$$

In particular, for linear problems we can choose $\tau$ equal to 2.
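The lower bound in (14) is easy to evaluate for a given tangential cone constant. The following small helper is only a sketch; the name `min_tau` and the sample values of $\eta$ are assumptions, and the range check encodes the requirement $0 \le \eta < 1/2$ implied by the denominator $1 - 2\eta$ in (14).

```python
def min_tau(eta):
    """Smallest tau allowed by (14): 2 (1 + eta) / (1 - 2 eta), for 0 <= eta < 1/2."""
    if not 0.0 <= eta < 0.5:
        raise ValueError("the tangential cone constant must satisfy 0 <= eta < 1/2")
    return 2.0 * (1.0 + eta) / (1.0 - 2.0 * eta)


tau_linear = min_tau(0.0)      # linear problems satisfy (11) with eta = 0
tau_quarter = min_tau(0.25)    # a hypothetical nonlinear problem with eta = 1/4
print(tau_linear, tau_quarter)
```

The bound grows without limit as $\eta$ approaches $1/2$, so a stronger nonlinearity forces a more conservative loping threshold $\tau\delta_i$.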

In the sequel we verify some basic results that are necessary for the convergence analysis derived in the next section. The first result concerns the well-definedness and positivity of the relaxation parameter $\alpha_k$.

###### Lemma 2.1.

Let assumptions (10) - (12) be satisfied. Then the coefficients $\alpha_k$ in (7) are well-defined and positive.

###### Proof.

If $\omega_k = 0$, the assertion follows from (7). If $\omega_k = 1$, then $\|F_{[k]}(x_k^\delta) - y_{[k]}^\delta\| \ge \tau \delta_{[k]}$ and the assertion is a consequence of [29, Lemma 3.1], applied to $F_{[k]}$ instead of $F$. ∎

In the next lemma we prove an estimate for the step size of the l-SDK iteration.

###### Lemma 2.2.

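Let $s_k$ and $\alpha_k$ be defined by (5) and (7). Then

$$\alpha_k \|s_k\|^2 \le \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \|^2, \quad k \in \mathbb{N}. \tag{15}$$

###### Proof.

It is enough to consider the case $\omega_k = 1$. Since $\Phi_{\mathrm{rel}}(s) \le s$, it follows from (7) that

$$\alpha_k \|s_k\|^2 = \Phi_{\mathrm{rel}}\!\left( \frac{\|s_k\|^2}{\|F'_{[k]}(x_k^\delta) s_k\|^2} \right) \|s_k\|^2 \le \frac{\|s_k\|^4}{\|F'_{[k]}(x_k^\delta) s_k\|^2}. \tag{16}$$

Moreover, from the definition of $s_k$ we obtain

$$\begin{aligned}
\|F'_{[k]}(x_k^\delta) s_k\| &= \bigl\| F'_{[k]}(x_k^\delta) F'_{[k]}(x_k^\delta)^* \bigl[ F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr] \bigr\|, \\
\|s_k\|^2 &\le \bigl\| F'_{[k]}(x_k^\delta) F'_{[k]}(x_k^\delta)^* \bigl[ F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr] \bigr\| \, \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \|.
\end{aligned}$$

Now, substituting the last two expressions in (16) shows (15). ∎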
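The step-size estimate (15) can be checked numerically for a single linear operator $F(x) = Ax$, for which $F'(x)^* = A^T$. The matrix, the point `x`, the data `y_delta`, and the relaxation function $\Phi_{\mathrm{rel}}(t) = \min(t, 1)$ below are assumptions chosen purely for illustration; the check is a sanity test of the inequality, not a proof.

```python
def matvec(A, x):
    """Return A x for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Return A^T r, the adjoint of x -> A x applied to r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def sq_norm(v):
    return sum(c * c for c in v)


A = [[2.0, 1.0], [0.0, 1.0], [1.0, 3.0]]    # assumed linear operator F(x) = A x
x = [1.0, -1.0]                             # current iterate (assumed)
y_delta = [0.0, 2.0, 1.0]                   # noisy data (assumed)

r = [u - v for u, v in zip(matvec(A, x), y_delta)]   # residual F(x) - y^delta
s = rmatvec(A, r)                                    # s_k from (5)
As = matvec(A, s)                                    # F'(x) s_k
ratio = sq_norm(s) / sq_norm(As)                     # argument of Phi_rel in (7)
alpha = min(ratio, 1.0)                              # assumed Phi_rel(t) = min(t, 1)

lhs = alpha * sq_norm(s)                             # left-hand side of (15)
rhs = sq_norm(r)                                     # right-hand side of (15)
print(lhs, rhs, lhs <= rhs)
```

Any relaxation function with $\Phi_{\mathrm{rel}}(t) \le t$ can only decrease `alpha` relative to `ratio`, so the inequality continues to hold for such choices.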

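The following lemma is an important auxiliary result, which will be used at several places throughout this article.

###### Lemma 2.3.

Let $x_k^\delta$, $s_k$, $\omega_k$, and $\alpha_k$ be defined by (4) - (7) and assume that (10) - (12) hold true. If $x_k^\delta \in B_{\rho/2}(x^*)$ for some $k \in \mathbb{N}$, then

$$\| x_{k+1}^\delta - x^* \|^2 - \| x_k^\delta - x^* \|^2 \le \omega_k \alpha_k \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \Bigl( (2\eta - 1) \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| + 2(1+\eta)\delta_{[k]} \Bigr). \tag{17}$$

###### Proof.

If $\omega_k = 0$, then $x_{k+1}^\delta = x_k^\delta$ and (17) follows with equality. If $\omega_k = 1$, it follows from (4) and (5) and Lemma 2.2 that

$$\begin{aligned}
\| x_{k+1}^\delta - x^* \|^2 - \| x_k^\delta - x^* \|^2
&= 2 \langle x_k^\delta - x^*, x_{k+1}^\delta - x_k^\delta \rangle + \| x_{k+1}^\delta - x_k^\delta \|^2 \\
&= 2\alpha_k \langle x_k^\delta - x^*, F'_{[k]}(x_k^\delta)^* ( y_{[k]}^\delta - F_{[k]}(x_k^\delta) ) \rangle + \alpha_k^2 \|s_k\|^2 \\
&\le 2\alpha_k \langle y_{[k]}^\delta - F_{[k]}(x_k^\delta), F'_{[k]}(x_k^\delta)(x_k^\delta - x^*) \rangle + \alpha_k \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \|^2 \\
&= \alpha_k \Bigl( 2 \langle y_{[k]}^\delta - F_{[k]}(x_k^\delta), F'_{[k]}(x_k^\delta)(x_k^\delta - x^*) - F_{[k]}(x_k^\delta) + F_{[k]}(x^*) \rangle \\
&\qquad\quad + 2 \langle y_{[k]}^\delta - F_{[k]}(x_k^\delta), y_{[k]}^\delta - y_{[k]} \rangle - \| y_{[k]}^\delta - F_{[k]}(x_k^\delta) \|^2 \Bigr).
\end{aligned}$$

Now, applying (11) with $x = x_k^\delta$ and $\bar{x} = x^*$ leads to

$$\| x_{k+1}^\delta - x^* \|^2 - \| x_k^\delta - x^* \|^2 \le \omega_k \alpha_k \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \Bigl( 2\eta \| F_{[k]}(x_k^\delta) - y_{[k]} \| + 2\delta_{[k]} - \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \Bigr).$$

The last inequality, together with $\| F_{[k]}(x_k^\delta) - y_{[k]} \| \le \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| + \delta_{[k]}$ (which follows from (1)), shows (17). ∎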

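Our next goal is to prove a monotonicity property, known to be satisfied by other iterative regularization methods, e.g., by the Landweber [7], the steepest descent [29], the LK [17], and the l-LK [11] method.

###### Proposition 2.4 (Monotonicity).

Under the assumptions of Lemma 2.3,

$$\| x_{k+1}^\delta - x^* \|^2 \le \| x_k^\delta - x^* \|^2, \quad k \in \mathbb{N}. \tag{18}$$

Moreover, all iterates $x_k^\delta$ remain in $B_{\rho/2}(x^*) \subset B_\rho(x_0)$ and satisfy (17).

###### Proof.

From (12) it follows that $x_0^\delta = x_0 \in B_{\rho/2}(x^*)$. If $\omega_0 = 0$, then $x_1^\delta = x_0^\delta$ satisfies (18) with equality and $x_1^\delta \in B_{\rho/2}(x^*)$. If $\omega_0 = 1$, then $\| F_0(x_0^\delta) - y_0^\delta \| \ge \tau\delta_0$ and Lemma 2.3 implies

$$\begin{aligned}
\| x_1^\delta - x^* \|^2 - \| x_0^\delta - x^* \|^2
&\le \alpha_0 \| F_0(x_0^\delta) - y_0^\delta \| \Bigl( (2\eta - 1) \| F_0(x_0^\delta) - y_0^\delta \| + 2(1+\eta)\delta_0 \Bigr) \\
&\le \alpha_0 \| F_0(x_0^\delta) - y_0^\delta \| \, \delta_0 \Bigl( (2\eta - 1)\tau + 2(1+\eta) \Bigr).
\end{aligned}$$

Therefore (18), for $k = 0$, follows from (14), which guarantees $(2\eta - 1)\tau + 2(1+\eta) \le 0$. In particular, $x_1^\delta \in B_{\rho/2}(x^*)$. An inductive argument implies (18) and that $x_k^\delta \in B_{\rho/2}(x^*)$ for all $k \in \mathbb{N}$. The assertions therefore follow from Lemma 2.3. ∎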
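The monotonicity property (18) can be observed on a small linear example with noisy data. The sketch below uses scalar equations $F_i(x) = \langle a_i, x \rangle$, so that $\eta = 0$ and $\tau = 2$ is admissible; the rows, the noise values, the relaxation function $\Phi_{\mathrm{rel}}(t) = \min(t, \alpha_{\max})$, and the number of steps are illustrative assumptions rather than data from the paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rows = [[1.0, 0.5], [0.2, 1.0], [1.0, 1.0]]     # F_i(x) = <a_i, x> (assumed)
x_true = [1.0, -1.0]                            # exact solution x^* (assumed)
noise = [0.01, -0.01, 0.01]                     # data perturbations (assumed)
deltas = [0.01, 0.01, 0.01]                     # noise levels with |noise_i| <= delta_i
y_delta = [dot(a, x_true) + e for a, e in zip(rows, noise)]
tau, alpha_max = 2.0, 10.0

x = [0.0, 0.0]
errors = [distance(x, x_true)]
for k in range(60):
    i = k % len(rows)                           # i = [k]
    a = rows[i]
    r = dot(a, x) - y_delta[i]                  # residual F_[k](x_k) - y^delta_[k]
    if abs(r) >= tau * deltas[i]:               # omega_k = 1, see (6)
        s = [r * c for c in a]                  # s_k = F'^* (residual), see (5)
        Fs = dot(a, s)                          # F'_[k] s_k (nonzero since r != 0)
        alpha = min(dot(s, s) / Fs ** 2, alpha_max)   # (7) with assumed Phi_rel
        x = [xi - alpha * si for xi, si in zip(x, s)]
    errors.append(distance(x, x_true))

# Allow a tiny tolerance for floating-point rounding.
monotone = all(e1 <= e0 + 1e-12 for e0, e1 in zip(errors, errors[1:]))
print(monotone, errors[0], errors[-1])
```

Steps with $\omega_k = 0$ leave the iterate unchanged, so the error sequence is constant there and strictly decreasing only on the steps that are actually carried out.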

## 3 Convergence Analysis of the Loping Steepest Descent Kaczmarz Method

In this section we provide a complete convergence analysis for the l-SDK iteration, showing that it is a convergent regularization method in the sense of [7] (see Theorems 3.3 and 3.6 below). Throughout this section, we assume that (10) - (14) hold, and that $x_k^\delta$, $s_k$, $\omega_k$, and $\alpha_k$ are defined by (4) - (7).

Our first goal is to prove convergence of the l-SDK iteration for $\delta = 0$. For exact data $y^\delta = y$, the iterates in (4) are denoted by $x_k$.

###### Lemma 3.1.

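There exists an $x_0$-minimal norm solution of (2) in $B_{\rho/2}(x_0)$, i.e., a solution $x^\dagger$ of (2) such that

$$\| x^\dagger - x_0 \| = \inf \{ \| x - x_0 \| : x \in B_{\rho/2}(x_0) \text{ and } F(x) = y \}.$$

Moreover, $x^\dagger$ is the only solution of (2) in $B_{\rho/2}(x_0) \cap \bigl( x_0 + N(F'(x^\dagger))^\perp \bigr)$.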

###### Proof.

Lemma 3.1 is a consequence of [13, Proposition 2.1]. A detailed proof can be found in [16]. ∎

###### Lemma 3.2.

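For all $k \in \mathbb{N}$, we have $\alpha_k \ge \alpha_{\min}$.

###### Proof.

For $\omega_k = 0$ the claimed estimate holds with equality. If $\omega_k = 1$, it follows from (10) that

$$\|s_k\|^2 / \|F'_{[k]}(x_k^\delta) s_k\|^2 \ge \|F'_{[k]}(x_k^\delta)\|^{-2} \ge 1/M^2.$$

Now the monotonicity of $\Phi_{\mathrm{rel}}$ implies $\alpha_k \ge \Phi_{\mathrm{rel}}(1/M^2) = \alpha_{\min}$. ∎

Throughout the rest of this article, $x^\dagger$ denotes the $x_0$-minimal norm solution of (2). We define $e_k := x^\dagger - x_k$. From Proposition 2.4, applied with $x^* = x^\dagger$, it follows that (17) holds for all $k \in \mathbb{N}$. By summing over all $k$, this leads to

$$\sum_{i=0}^{\infty} \alpha_i \| y_{[i]} - F_{[i]}(x_i) \|^2 \le \frac{\| x_0 - x^\dagger \|^2}{1 - 2\eta} < \infty. \tag{19}$$

Equation (19) and the monotonicity of $\|e_k\|$ shown in Proposition 2.4 are the main ingredients in the following proof of the convergence of the SDK iteration.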

###### Theorem 3.3 (Convergence for Exact Data).

For exact data, the iteration $x_k$ converges to a solution of (2), as $k \to \infty$. Moreover, if

$$N(F'(x^\dagger)) \subseteq N(F'(x)) \quad \text{for all } x \in B_\rho(x_0), \tag{20}$$

then $x_k \to x^\dagger$.

###### Proof.

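From (18) it follows that $\|e_k\|$ decreases monotonically and therefore that $\|e_k\|$ converges to some $\epsilon \ge 0$. In the following we show that $e_k$ is in fact a Cauchy sequence.

For $k = k_0 N + k_1$ and $l = l_0 N + l_1$ with $k \le l$ and $k_1, l_1 \in \{0, \dots, N-1\}$, let $n_0 \in \{k_0, \dots, l_0\}$ be such that

$$\sum_{i_1=0}^{N-1} \| F_{i_1}(x_{N n_0 + i_1}) - y_{i_1} \| \le \sum_{i_1=0}^{N-1} \| F_{i_1}(x_{N i_0 + i_1}) - y_{i_1} \|, \quad i_0 \in \{k_0, \dots, l_0\}. \tag{21}$$

Then, with $n := N n_0 + N - 1$, we have

$$\| e_k - e_l \| \le \| e_k - e_n \| + \| e_l - e_n \| \tag{22}$$

and

$$\begin{aligned}
\| e_n - e_k \|^2 &= \|e_k\|^2 - \|e_n\|^2 + 2 \langle e_n - e_k, e_n \rangle, \\
\| e_n - e_l \|^2 &= \|e_l\|^2 - \|e_n\|^2 + 2 \langle e_n - e_l, e_n \rangle.
\end{aligned} \tag{23}$$

For $k \to \infty$, the first two terms on each right-hand side of (23) converge to $\epsilon^2 - \epsilon^2 = 0$. Therefore, in order to show that $e_k$ is a Cauchy sequence, it is sufficient to prove that $\langle e_n - e_k, e_n \rangle$ and $\langle e_n - e_l, e_n \rangle$ converge to zero as $k \to \infty$.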

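To that end, we write $i = N i_0 + i_1$ with $i_1 \in \{0, \dots, N-1\}$, and set $i^* := N n_0 + i_1$. Then, using the definition of the steepest descent Kaczmarz iteration, it follows that

$$\begin{aligned}
| \langle e_n - e_k, e_n \rangle |
&= \Bigl| \sum_{i=k}^{n-1} \alpha_i \langle F'_{i_1}(x_i)^* ( y_{i_1} - F_{i_1}(x_i) ), x^\dagger - x_n \rangle \Bigr| \\
&\le \sum_{i=k}^{n-1} \alpha_i \bigl| \langle y_{i_1} - F_{i_1}(x_i), F'_{i_1}(x_i)(x^\dagger - x_{i^*}) + F'_{i_1}(x_i)(x_{i^*} - x_n) \rangle \bigr| \\
&\le \sum_{i=k}^{n-1} \alpha_i \| y_{i_1} - F_{i_1}(x_i) \| \, \| F'_{i_1}(x_i)(x^\dagger - x_{i^*}) \| \\
&\quad + \sum_{i=k}^{n-1} \alpha_i \| y_{i_1} - F_{i_1}(x_i) \| \, \| F'_{i_1}(x_i)(x_{i^*} - x_n) \|.
\end{aligned} \tag{24}$$

From (11) it follows that

$$\| F'_{i_1}(x_i)(x^\dagger - x_{i^*}) \| \le 2(1+\eta) \| y_{i_1} - F_{i_1}(x_i) \| + (1+\eta) \| y_{i_1} - F_{i_1}(x_{i^*}) \|. \tag{25}$$

Again using the definition of the steepest descent Kaczmarz iteration and equations (7), (10), it follows that

$$\begin{aligned}
\| F'_{i_1}(x_i)(x_{i^*} - x_n) \|
&\le M \| x_{i^*} - x_n \| \\
&\le M \sum_{j=i_1}^{N-2} \alpha_{N n_0 + j} \| F'_j(x_{N n_0 + j})^* ( F_j(x_{N n_0 + j}) - y_j ) \| \\
&\le \alpha_{\max} M^2 \sum_{j=0}^{N-1} \| F_j(x_{N n_0 + j}) - y_j \|.
\end{aligned} \tag{26}$$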

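Substituting (25), (26) in (24) leads to

$$\begin{aligned}
| \langle e_n - e_k, e_n \rangle |
&\le c \sum_{i_0=k_0}^{n_0} \sum_{i_1=0}^{N-1} \| y_{i_1} - F_{i_1}(x_{N i_0 + i_1}) \| \Bigl( \sum_{j=0}^{N-1} \| F_j(x_{N n_0 + j}) - y_j \| \Bigr) \\
&\le c \sum_{i_0=k_0}^{n_0} \Bigl( \sum_{i_1=0}^{N-1} \| y_{i_1} - F_{i_1}(x_{N i_0 + i_1}) \| \Bigr)^2
\end{aligned}$$

with a constant $c > 0$ that does not depend on $k$ and $n$. Here we made use of (21). So, using Lemma 3.2, we finally obtain the estimate

$$| \langle e_n - e_k, e_n \rangle | \le \frac{N c}{\alpha_{\min}} \sum_{i_0=k_0}^{n_0} \sum_{i_1=0}^{N-1} \alpha_{N i_0 + i_1} \| y_{i_1} - F_{i_1}(x_{N i_0 + i_1}) \|^2.$$

Because of (19), the last sum tends to zero for $k \to \infty$, and therefore $\langle e_n - e_k, e_n \rangle \to 0$. Analogously one shows that $\langle e_n - e_l, e_n \rangle \to 0$. Therefore $e_k$ is a Cauchy sequence and $x_k = x^\dagger - e_k$ converges to an element of $X$. Because all residuals $\| F_i(x_k) - y_i \|$ tend to zero, this limit is a solution of (2).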

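Now assume that (20) holds, i.e., $N(F'(x^\dagger)) \subseteq N(F'(x))$ for $x \in B_\rho(x_0)$. Then from the definition of $x_{k+1}$ it follows that

$$x_{k+1} - x_k \in R\bigl( F'_{[k]}(x_k)^* \bigr) \subset N\bigl( F'_{[k]}(x_k) \bigr)^\perp \subset N\bigl( F'(x_k) \bigr)^\perp \subset N\bigl( F'(x^\dagger) \bigr)^\perp.$$

An inductive argument shows that all iterates $x_k$ are elements of $x_0 + N(F'(x^\dagger))^\perp$. Since this affine subspace is closed, the limit of $x_k$ belongs to it as well. By Lemma 3.1, $x^\dagger$ is the only solution of (2) in $B_{\rho/2}(x_0) \cap \bigl( x_0 + N(F'(x^\dagger))^\perp \bigr)$, and so the second assertion follows. ∎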

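The second goal in this section is to prove that $x^\delta_{k_*^\delta}$ converges to a solution of (2), as $\delta \to 0$. First we verify that, for noisy data, the stopping index $k_*^\delta$ defined in (8) is finite.

###### Proposition 3.4 (Stopping Index).

Assume $\delta_{\min} := \min\{\delta_0, \dots, \delta_{N-1}\} > 0$. Then $k_*^\delta$ defined in (8) is finite, and

$$\| F_i(x^\delta_{k_*^\delta}) - y_i^\delta \| < \tau\delta_i, \quad i = 0, \dots, N-1. \tag{27}$$

###### Proof.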

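Assume that for every $l \in \mathbb{N}$ there exists $i \in \{0, \dots, N-1\}$ such that $\omega_{lN+i} = 1$. From Proposition 2.4 it follows that we can apply (17) recursively for $k = 1, \dots, lN$ and obtain

$$-\| x_0 - x^* \|^2 \le \sum_{k=1}^{lN} \omega_k \alpha_k \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \Bigl( 2(1+\eta)\delta_{[k]} - (1-2\eta) \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \Bigr), \quad l \in \mathbb{N}.$$

Using the fact that either $\omega_k = 0$ or $\| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| \ge \tau\delta_{[k]}$, we obtain

$$\| x_0 - x^* \|^2 \ge \bigl( \tau(1-2\eta) - 2(1+\eta) \bigr) \sum_{k=1}^{lN} \omega_k \alpha_k \delta_{[k]} \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \|. \tag{28}$$

Equation (28), Lemma 3.2 and the fact that $\omega_k = 1$ for at least one $k$ in every cycle imply

$$\| x_0 - x^* \|^2 \ge \bigl( \tau(1-2\eta) - 2(1+\eta) \bigr) \, l \, \alpha_{\min} \delta_{\min} (\tau\delta_{\min}), \quad l \in \mathbb{N}. \tag{29}$$

The right-hand side of (29) tends to infinity as $l \to \infty$, which gives a contradiction. Consequently, the set in (8) is nonempty and $k_*^\delta$ is finite.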

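To prove (27), assume to the contrary that $\| F_i(x^\delta_{k_*^\delta}) - y_i^\delta \| \ge \tau\delta_i$ for some $i \in \{0, \dots, N-1\}$. With $k := k_*^\delta + i$, it follows from (6) and (8) that $\omega_k = 1$ and $x^\delta_{k+1} = x^\delta_k = x^\delta_{k_*^\delta}$, respectively. Thus, Proposition 2.4 and Lemma 2.1 imply

$$\begin{aligned}
0 = \| x^\delta_{k+1} - x^* \|^2 - \| x^\delta_k - x^* \|^2
&\le \alpha_k \| F_i(x^\delta_k) - y_i^\delta \| \Bigl( (2\eta - 1) \| F_i(x^\delta_k) - y_i^\delta \| + 2(1+\eta)\delta_i \Bigr) \\
&\le \alpha_k \| F_i(x^\delta_k) - y_i^\delta \| \, \delta_i \Bigl( (2\eta - 1)\tau + 2(1+\eta) \Bigr).
\end{aligned}$$

This contradicts (14), concluding the proof of (27). ∎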

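The last auxiliary result concerns the continuity of $x_k^\delta$ at $\delta = 0$. For $k \in \mathbb{N}$, noise levels $\delta = (\delta_0, \dots, \delta_{N-1})$, and data $y, y^\delta \in Y^N$ we define

$$\Delta_k(\delta, y, y^\delta) := \omega_k F'_{[k]}(x_k^\delta)^* \bigl( F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr) - F'_{[k]}(x_k)^* \bigl( F_{[k]}(x_k) - y_{[k]} \bigr).$$

###### Lemma 3.5.

For all $k \in \mathbb{N}$,

$$\lim_{\delta \to 0} \sup \bigl\{ \| \Delta_k(\delta, y, y^\delta) \| : y^\delta \in Y^N, \ \| y_i - y_i^\delta \| \le \delta_i \bigr\} = 0. \tag{30}$$

Moreover, $x_{k+1}^\delta \to x_{k+1}$, as $\delta \to 0$.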

###### Proof.

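We prove Lemma 3.5 by induction. The case $k = 0$ is similar to the general case and is omitted.

Now, assume $k > 0$ and that (30) holds for all $k' < k$. First we note that (30) and the continuity of $\Phi_{\mathrm{rel}}$ imply $x_{k+1}^\delta \to x_{k+1}$, as $\delta \to 0$. For the proof of (30) we consider two cases. In the first case, $\omega_k = 1$, we have

$$\| \Delta_k(\delta, y, y^\delta) \| = \bigl\| F'_{[k]}(x_k^\delta)^* \bigl( F_{[k]}(x_k^\delta) - y_{[k]}^\delta \bigr) - F'_{[k]}(x_k)^* \bigl( F_{[k]}(x_k) - y_{[k]} \bigr) \bigr\|.$$

In the second case, $\omega_k = 0$, we have $\| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| < \tau\delta_{[k]}$ and consequently

$$\begin{aligned}
\| \Delta_k(\delta, y, y^\delta) \|
&= \| F'_{[k]}(x_k)^* ( F_{[k]}(x_k) - y_{[k]} ) \| \\
&\le \| F'_{[k]}(x_k) \| \Bigl( \| F_{[k]}(x_k) - F_{[k]}(x_k^\delta) \| + \| F_{[k]}(x_k^\delta) - y_{[k]}^\delta \| + \| y_{[k]}^\delta - y_{[k]} \| \Bigr) \\
&\le \| F'_{[k]}(x_k) \| \Bigl( \| F_{[k]}(x_k) - F_{[k]}(x_k^\delta) \| + (\tau + 1)\delta_{[k]} \Bigr).
\end{aligned}$$

Now (30) follows from (10), the continuity of $F_{[k]}$ and $F'_{[k]}$, and the induction hypothesis (which implies $x_k^\delta \to x_k$). ∎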

###### Theorem 3.6 (Convergence for Noisy Data).

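Assume $\delta^j = (\delta_0^j, \dots, \delta_{N-1}^j)$ is a sequence in $(0, \infty)^N$ with $\lim_{j \to \infty} \delta^j = 0$. Let $y^j = (y_0^j, \dots, y_{N-1}^j)$ be a sequence of noisy data satisfying

$$\| y_i^j - y_i \| \le \delta_i^j, \quad i = 0, \dots, N-1, \quad j \in \mathbb{N},$$

and let $k_*^j$ denote the corresponding stopping index defined in (8). Then $x^{\delta^j}_{k_*^j}$ converges to a solution of (2), as $j \to \infty$. Moreover, if (20) holds, then $x^{\delta^j}_{k_*^j} \to x^\dagger$.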

###### Proof.

Let denote the limit of the iterates which is a solution of (2), cf. Theorem 3.3. From Lemma