Variational Convergence Analysis

Variational Convergence Analysis With Smoothed-TV Interpretation

Erdem Altuntac Institute for Numerical and Applied Mathematics, University of Göttingen, Lotzestr. 16-18, D-37083, Göttingen, Germany
Abstract

The problem of minimizing the least squares functional with a Fréchet differentiable, lower semi-continuous, convex penalizer J is considered. The penalizer maps the functions of the Banach space V into R+. To be more precise, we also assume that the given measured data fδ is defined on a compactly supported domain Z and lies in the Hilbert space L2(Z). Then the general Tikhonov cost functional, associated with a given linear, compact and injective forward operator T : V → L2(Z), is formulated as

 Fα(φ,fδ): V×L2(Z) →R+, (φ,fδ)⟼ Fα(φ,fδ):=12||Tφ−fδ||2L2(Z)+αJ(φ).

Convergence of the regularized optimum solution to the true solution is analysed by means of the Bregman distance.

The first part of this work provides a general convergence analysis for a strongly convex penalty functional J in the cost functional Fα. The key observation in this part is that strong convexity of the penalty term, with its convexity modulus, implies norm convergence in the Bregman metric sense. We also study the characterization of convergence by means of a concave, monotonically increasing index function Ψ with Ψ(0) = 0. In the second part, this general analysis will be interpreted for the smoothed-TV functional JTVβ,

 JTVβ(φ):=∫Ω√∥∇φ(x)∥22+βdx,

where Ω is a compact and convex domain. To this end, a new lower bound for the Hessian of JTVβ will be estimated. The result of this work is applicable to any strongly convex functional.

Keywords. convex regularization, Bregman distance, smoothed total variation.

1 Introduction

As an alternative to the well established Tikhonov regularization, [33, 34], the study of convex variational regularization with a general penalty term has become important over the last decade. The introduction of a new image denoising method named total variation, [36], was the commencement of this line of study. Application and analysis of the method have been widely carried out in the inverse problems and optimization communities, [1, 4, 6, 12, 13, 14, 17, 18, 40]. In particular, formulating the minimization problem as a variational problem and estimating convergence rates under variational source conditions has also become popular recently, [11, 23, 24, 25, 32].

The problem of finding the optimal minimizer of a general Tikhonov type functional is formulated as follows:

 φα(δ)∈argminφ∈V{12||Tφ−fδ||2H+αJ(φ)}. (1.1)

Here, J is the convex penalty term, smooth in the Fréchet derivative sense, and α > 0 is the regularization parameter in front of it.

This work utilizes convex analysis together with the Bregman distance as two fundamental tools to arrive at convergence and convergence rates for a convex regularization strategy. In particular, it will be observed that strong convexity provides a new quantitative estimate for the Bregman distance, which also implies norm convergence. We will interpret this observation for the smoothed-TV functional, [14, 17],

 JTVβ(φ):=∫Ω√∥∇φ(x)∥22+βdx.

Eventually, it will be shown that the strong convexity of JTVβ requires the solution to lie in a Sobolev space.
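For intuition, the smoothed-TV functional above can be approximated on a uniform grid by summing √(||∇φ||2² + β) over pixels. The following is only an illustrative sketch; the grid size, spacing h and the `np.gradient` stencil are our assumptions, not the paper's discretization.

```python
import numpy as np

def smoothed_tv(phi, beta, h=1.0):
    """Discrete J_{TV_beta}(phi) ~ sum_x sqrt(||grad phi(x)||_2^2 + beta) * h^2."""
    gx, gy = np.gradient(phi, h)  # central/one-sided finite-difference gradient
    return float(np.sum(np.sqrt(gx ** 2 + gy ** 2 + beta)) * h ** 2)

# For a constant image the gradient vanishes, so the value is sqrt(beta) * |Omega|.
flat = np.zeros((4, 4))
value = smoothed_tv(flat, beta=0.01)  # ~ sqrt(0.01) * 16 = 1.6
```

Note that β > 0 keeps the integrand differentiable at ∇φ = 0, which is exactly what the smoothing is for.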

We rather focus on an a posteriori strategy for the choice of the regularization parameter, which does not require any a priori knowledge about the true solution. We always work with the given perturbed data fδ and introduce the rates according to the perturbation amount δ. Under this a posteriori strategy and the assumed deterministic noise model ||f† − fδ|| ≤ δ in the measurement space, the following rates will be quantified:

1. the discrepancy between Tφα(δ) and Tφ†, at the rate O(δ);

2. an upper bound for the Bregman distance DJ(φα(δ),φ†), which immediately implies the desired norm convergence;

3. convergence of the regularized solution to the true solution, at the rate of the index function Ψ.

2 Notations and prerequisite knowledge

2.1 Functional analysis notations

Let C(Ω) be the space of continuous functions on a compact domain Ω with Lipschitz boundary ∂Ω. Then, the function space Ck(Ω) is defined by

 Ck(Ω):={φ∈C(Ω):Dσφ∈C(Ω), ∀σ∈Nd with |σ|≤k}.

We will also need to work with Sobolev spaces. We define the Sobolev space Wk,p(Ω), for 1 ≤ p < ∞, by

 Wk,p(Ω):={φ∈Lp(Ω):Dσφ∈Lp(Ω), ∀σ∈Nd with |σ|≤k}.

We also denote the corresponding Sobolev space with zero boundary values by

 Wk,p0(Ω):={φ∈C(Ω) : Dσφ∈Lp(Ω) ∀σ∈Nd with |σ|≤k, and φ(x)=0 for x∈∂Ω}.

It is also worthwhile to recall the density argument, [21, Subsection 5.2.2], namely that C∞c(Ω) is dense in Wk,p0(Ω) with respect to the Wk,p norm:

 cl(C∞c(Ω)) = Wk,p0(Ω).

In this work, we focus on the total variation (TV) of a function in L1(Ω). The TV of a function defined over the compact domain Ω is given below.

Definition 2.1 (TV(φ,Ω)).

[37, Definition 9.64] Over the compact domain Ω, the total variation of a function φ ∈ L1(Ω) is defined by the following variational form:

 TV(φ,Ω):=supΦ∈C1c(Ω;Rd){∫Ωφ(x)divΦ(x)dx : ||Φ||∞≤1}. (2.1)

Total variation type regularization targets the reconstruction of functions of bounded variation (BV), a class defined by

 BV(Ω):={φ∈L1(Ω):TV(φ,Ω)<∞} (2.2)

with the norm

 ||φ||BV:=||φ||L1+TV(φ,Ω). (2.3)

BV function spaces are Banach spaces, [39]. Furthermore, if a function is in the Sobolev space W1,1(Ω), then it is also in BV(Ω) (see [1] and [39, Proposition 8.13]). By the result in [1, Theorem 2.1], it is known that, with a proper choice of Φ, one can arrive at the following formulation from (2.1):

 TV(φ)=∫Ω||∇φ(x)||2dx≅∫Ω(||∇φ(x)||22+β)1/2dx=JTVβ(φ), (2.4)

where β > 0 is fixed. We also refer to [12, 14, 17, 36, 40], where (2.4) has appeared.
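The approximation (2.4) can be checked numerically: since √(a + β) ≤ √a + √β, the smoothed functional exceeds the discrete TV by at most √β·|Ω|. A small sketch, in which the ramp image and the unit grid spacing are illustrative assumptions:

```python
import numpy as np

def tv_pair(phi, beta, h=1.0):
    gx, gy = np.gradient(phi, h)
    mag = np.sqrt(gx ** 2 + gy ** 2)                            # |grad phi|_2 pointwise
    tv = float(np.sum(mag) * h ** 2)                            # discrete TV(phi)
    tv_beta = float(np.sum(np.sqrt(mag ** 2 + beta)) * h ** 2)  # discrete J_{TV_beta}(phi)
    return tv, tv_beta

beta = 1e-4
phi = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))  # illustrative ramp image, h = 1
tv, tv_b = tv_pair(phi, beta)
area = phi.size                                   # |Omega| for unit spacing
# sqrt(m^2 + beta) lies between m and m + sqrt(beta), so tv <= tv_b <= tv + sqrt(beta)*area
```

As β → 0 the gap √β·|Ω| vanishes, which is the sense of the "≅" in (2.4).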

2.2 Some motivation for general regularization theory

For a given linear, injective and compact forward operator T : V → H over some compact and convex domain Ω, we formulate the following smooth, convex variational minimization problem,

 argminφ∈V{12||Tφ−fδ||2H+αJ(φ)} (2.5)

with penalty J and regularization parameter α > 0. A dual minimization problem to (2.5) is given by

 J(φ)→minφ∈V, subject to ||Tφ−fδ||H≤δ. (2.6)

Following from the problem (2.5), the general Tikhonov type cost functional with convex penalty term is then formulated as

 Fα(φ,fδ):=12||Tφ−fδ||2H+αJ(φ). (2.7)

In Hilbert scales, it is known that the solution of the penalized minimization problem (2.5) coincides with the solution of the constrained minimization problem (2.6), [11, Subsection 3.1]. The regularized solution of the problem (2.5) satisfies the following first order optimality conditions:

 0 = ∇Fα(φα(δ),fδ)   (2.8)
 0 = T∗(Tφα(δ)−fδ)+α(δ)∇J(φα(δ))
 T∗(fδ−Tφα(δ)) = α(δ)∇J(φα(δ)).
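The last optimality line can be verified numerically in the quadratic special case J(φ) = (1/2)||φ||², for which ∇J(φ) = φ and the minimizer has the closed form φα = (T∗T + αI)⁻¹T∗fδ. This is only an illustrative sketch: the matrix T, the data and α are made up, and the quadratic J is not the paper's TV penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((5, 3))   # a small, generically injective "forward operator"
f_delta = rng.standard_normal(5)  # illustrative noisy data
alpha = 0.1

# phi_alpha solves (T^T T + alpha I) phi = T^T f_delta, i.e. grad F_alpha(phi) = 0
phi_alpha = np.linalg.solve(T.T @ T + alpha * np.eye(3), T.T @ f_delta)

# Check T^*(f_delta - T phi_alpha) = alpha * grad J(phi_alpha), with grad J(phi) = phi
lhs = T.T @ (f_delta - T @ phi_alpha)
rhs = alpha * phi_alpha
```

The identity holds exactly (up to floating-point error) by construction of φα.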

The choice of the regularization parameter in this work does not require any a priori knowledge about the true solution. We always work with perturbed data and introduce the rates according to the perturbation amount δ. Throughout the stability analysis, we consider the classical deterministic noise model

 fδ∈Bδ(f†), i.e., ||f†−fδ||≤δ.

2.3 Bregman distance as a vital tool for the norm convergence

Definition 2.2.

[Bregman distance][10] Let P : V → R be a convex functional, smooth in the Fréchet derivative sense. Then, for u, u∗ ∈ V, the Bregman distance associated with the functional P is defined by

 DP(u,u∗)=P(u)−P(u∗)−⟨∇P(u∗),u−u∗⟩. (2.9)
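For a concrete instance of Definition 2.2, take P(u) = ||u||2², for which the Bregman distance reduces to DP(u,u∗) = ||u−u∗||2². A finite-dimensional sketch; the vectors are illustrative:

```python
import numpy as np

def bregman(P, gradP, u, v):
    """D_P(u, v) = P(u) - P(v) - <grad P(v), u - v>, cf. (2.9)."""
    return P(u) - P(v) - float(gradP(v) @ (u - v))

P = lambda u: float(u @ u)   # P(u) = ||u||_2^2
gradP = lambda u: 2.0 * u

u = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 1.0])
d = bregman(P, gradP, u, v)  # equals ||u - v||_2^2 for this quadratic P
```

For non-quadratic P the Bregman distance is in general neither symmetric nor a metric, which motivates the symmetric variant introduced below in this subsection.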

The following formulation emphasizes the role of the Bregman distance in proving norm convergence of the minimizer of the convex minimization problem to the true solution.

Definition 2.3.

[Total convexity][9, Definition 1]

Let P be a Fréchet differentiable convex functional. Then P is called totally convex in u∗ ∈ V if

 P(u)−P(u∗)−⟨∇P(u∗),u−u∗⟩→0⇒||u−u∗||V→0.

It is said that P is q-convex in u∗ with some q > 1 if there exists a constant c∗ > 0 such that for all u ∈ V we have

 P(u)−P(u∗)−⟨∇P(u∗),u−u∗⟩≥c∗||u−u∗||qV. (2.10)

Throughout our norm convergence estimations, we refer to this definition for the case of 2-convexity. We will also study different formulations of the Bregman distance. The common usage of the Bregman distance is to associate it with the penalty term J appearing in the problem (2.5). Here, we also make use of other instances of the Bregman distance.

Remark 2.4.

[Examples of the Bregman distance] Let φα(δ) and φ† be the regularized and the true solution of the problem (2.5), respectively. Then we give the following examples of the Bregman distance:

• Bregman distance associated with the cost functional Fα,

 DFα(φα(δ),φ†)=Fα(φα(δ),fδ)−Fα(φ†,fδ)−⟨∇Fα(φ†,fδ),φα(δ)−φ†⟩, (2.11)
• Bregman distance associated with the penalty J,

 DJ(φα(δ),φ†)=J(φα(δ))−J(φ†)−⟨∇J(φ†),φα(δ)−φ†⟩. (2.12)

The composite form of the classical Bregman distance yields another formulation, named the symmetric Bregman distance, [24, Definition 2.1], defined by

 DsymP(u,u∗):=DP(u,u∗)+DP(u∗,u). (2.13)

Inherently, the symmetric Bregman distance is also useful for showing norm convergence, as established below.

Proposition 2.5.

[24, as appears in the proof of Theorem 4.4] Let P be a smooth and 2-convex functional. Then there exists a positive constant c∗ such that for all u, u∗ ∈ V we have

 DsymP(u,u∗) = ⟨∇P(u∗)−∇P(u),u∗−u⟩ ≥ c∗||u−u∗||2V. (2.14)
Proof.

The proof is a straightforward consequence of the estimate (2.10) and the definition of the symmetric Bregman distance in (2.13). ∎

In Definition 2.3, by the estimate (2.10), it was stated that norm convergence is guaranteed in the presence of some positive real valued constant c∗ bounding the Bregman distance, given by (2.9), from below. It is possible to derive an alternative estimate to (2.10), or to the well known Xu–Roach inequalities in [41], in the case q = 2, by making the further assumption that the functional is strongly convex with modulus c, [5, Definition 10.5]. Below, we formulate the first result of this work, which is the basis of our norm estimations in the analysis. We introduce another notation before giving our formulation. For self-adjoint operators A and B on some reflexive Banach space V, we write A ⪰ B to mean that ⟨(A−B)u,u⟩ ≥ 0 for all u ∈ V.

Proposition 2.6.

Over the compact and convex domain Ω, let P be a strongly convex and twice continuously differentiable functional. Then the Bregman distance can be bounded from below:

 DP(u,v)≥c||u−v||2L2(Ω), (2.15)

where the modulus of convexity c > 0 satisfies P′′ ⪰ 2cI.

Proof.

Let us begin by considering the Taylor expansion of P about v:

 P(u)=P(v)+⟨P′(v),u−v⟩+12⟨P′′(v)(u−v),u−v⟩+o(||u−v||2L2(Ω)). (2.16)

where the little-o term is the second order Taylor remainder, which can be written in integral form as

 R2(u−v) = (1/2)∫01 (1−t)2 P′′′(v+t(u−v))(u−v)3 dt.

 DP(u,v) = P(u)−P(v)−⟨P′(v),u−v⟩
         = ⟨P′(v),u−v⟩+(1/2)⟨P′′(v)(u−v),u−v⟩+o(||u−v||2L2(Ω))−⟨P′(v),u−v⟩
         = (1/2)⟨P′′(v)(u−v),u−v⟩+o(||u−v||2L2(Ω)).

Since P is strongly convex, we have P′′(v) ⪰ 2cI, and one eventually obtains

 DP(u,v)≥c||u−v||2L2(Ω), (2.17)

where c is the modulus of convexity. ∎
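Proposition 2.6 can be sanity-checked in one dimension for the integrand ψ(t) = √(t² + β) underlying JTVβ: on a bounded set |t| ≤ M, its second derivative ψ″(t) = β/(t²+β)^(3/2) is bounded below by c := β/(M²+β)^(3/2), so the one-dimensional Bregman distance satisfies Dψ(u,v) ≥ (c/2)(u−v)². A numeric sketch with the assumed values β = 1 and M = 1/2:

```python
import math

beta, M = 1.0, 0.5                       # assumed smoothing parameter and bound |t| <= M
psi = lambda t: math.sqrt(t * t + beta)  # 1-D integrand of J_{TV_beta}
dpsi = lambda t: t / math.sqrt(t * t + beta)
c = beta / (M * M + beta) ** 1.5         # lower bound on psi'' over [-M, M]

u, v = 0.5, -0.5
D = psi(u) - psi(v) - dpsi(v) * (u - v)  # 1-D Bregman distance D_psi(u, v)
```

Note the modulus degrades as M grows or β shrinks, previewing why the Hessian lower bound for JTVβ in the second part requires a bounded setting.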


3 Variational Convergence Analysis

Due to the sophisticated nature of the TV penalty term in convex/non-convex minimization, variational inequalities are useful in the convergence analysis of minimization problems of the form (2.5). The title of this section expresses precisely this role of variational inequalities. Formulating the minimization problem as a variational problem and estimating convergence rates by means of source conditions in variational inequalities has become popular recently; see [11, 23, 24, 25, 32] and the references therein.

Recall that the classical deterministic noise model and the convexity of the penalty term of our minimization problem (2.5) are assumed throughout the analysis. Under an a posteriori parameter choice strategy, together with these assumptions, we will quantify the following rates:

1. the discrepancy between Tφα(δ) and Tφ†, at the rate O(δ);

2. an upper bound for the Bregman distance DJ(φα(δ),φ†), which immediately implies the desired norm convergence;

3. convergence of the regularized solution to the true solution, at the rate of the index function Ψ.

3.1 Choice of the regularization parameter with Morozov’s discrepancy principle

We are also concerned with the asymptotic properties of the regularization parameter obtained by Morozov's discrepancy principle for the Tikhonov-regularized solution. Morozov's discrepancy principle (MDP) serves as an a posteriori parameter choice rule for Tikhonov type cost functionals (2.7) and has a certain impact on the convergence of the regularized solution of the problem (2.5) with a general convex penalty term J. As introduced in [2, Theorem 3.10] and [3], we will use the following set notations in the theorem formulations needed to prove norm convergence of the regularized solution to the true solution for the problem (2.5):

 ¯S := {α : ||Tφα(δ)−fδ||L2(Z) ≤ ¯τδ for some φα(δ)∈argminφ∈V{Fα(φ,fδ)}},   (3.1)
 S_ := {α : τ_δ ≤ ||Tφα(δ)−fδ||L2(Z) for some φα(δ)∈argminφ∈V{Fα(φ,fδ)}},   (3.2)

where 1 < τ_ ≤ ¯τ are fixed. Analogously, as is well known from [20, Eq. (4.57) and (4.58)] and [31, Definition 2.3], in order to obtain tight rates of convergence we are interested in such a regularization parameter α = α(δ,fδ) that, with the fixed 1 < τ_ ≤ ¯τ,

 τ_δ ≤ ||Tφα(δ,fδ)−fδ||L2(Z) ≤ ¯τδ. (3.3)
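The condition above can be realized by a simple search, since the discrepancy ||Tφα − fδ|| is monotonically increasing in α for Tikhonov-type problems. A minimal sketch, assuming the quadratic penalty J(φ) = (1/2)||φ||² so that φα has a closed form; the operator, the data and the thresholds τ_ = 1.5, ¯τ = 2 are all illustrative choices:

```python
import numpy as np

def discrepancy(alpha, T, f_delta):
    """||T phi_alpha - f_delta|| for the quadratic penalty J(phi) = 0.5*||phi||^2."""
    n = T.shape[1]
    phi = np.linalg.solve(T.T @ T + alpha * np.eye(n), T.T @ f_delta)
    return float(np.linalg.norm(T @ phi - f_delta))

rng = np.random.default_rng(1)
T = rng.standard_normal((20, 5))                 # illustrative forward operator
noise = rng.standard_normal(20)
delta = 0.1
f_delta = T @ rng.standard_normal(5) + delta * noise / np.linalg.norm(noise)

tau_lo, tau_hi = 1.5, 2.0                        # fixed 1 < tau_lo <= tau_hi
a, b = 1e-10, 1e10                               # bracket for alpha
alpha = None
for _ in range(200):                             # bisection in log scale
    mid = np.sqrt(a * b)
    d = discrepancy(mid, T, f_delta)
    if d < tau_lo * delta:
        a = mid                                  # discrepancy too small: increase alpha
    elif d > tau_hi * delta:
        b = mid                                  # discrepancy too large: decrease alpha
    else:
        alpha = mid
        break
```

Because the discrepancy is continuous and increasing in α, the bisection must land in the window [τ_δ, ¯τδ] after finitely many steps.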

3.2 Variational inequalities for norm convergence

Convergence rate results for a general operator T can be obtained by formulating a variational inequality which uses the concept of index functions. A function Ψ : [0,∞) → [0,∞) is called an index function if it is continuous, monotonically increasing and satisfies Ψ(0) = 0.

Assumption 3.1.

[Variational Inequality][23, Eq. 1], [27, Eq 1.5], [29, Eq 2] There exist some constant ~γ > 0 and a concave index function Ψ such that, for all φ ∈ V,

 ~γ||φ−φ†||2L2(Ω)≤J(φ)−J(φ†)+Ψ(||Tφ−Tφ†||L2(Z)). (3.4)
Lemma 3.2.

For the cost functional defined by

 Fα(φ,fδ):=12||Tφ−fδ||2L2(Z)+αJ(φ),

with a Fréchet differentiable and convex penalty term J defined on the Hilbert space V, let φα ∈ argminφ∈V{Fα(φ,fδ)}. Then, for all φ ∈ V and any regularization parameter α > 0,

 α⟨∇J(φ),φα−φ⟩≤⟨T∗(Tφ−fδ),φ−φα⟩. (3.5)
Proof.

Since φα is the minimizer of the cost functional Fα, it holds that Fα(φα,fδ) ≤ Fα(φ,fδ) for all φ ∈ V and DFα(φα,φ) ≥ 0. Now, recall the Bregman distance associated with the cost functional in (2.11):

 0 ≤ DFα(φα,φ) = Fα(φα)−Fα(φ)−⟨∇Fα(φ),φα−φ⟩   (3.6)
               ≤ −⟨∇Fα(φ),φα−φ⟩
               = ⟨∇Fα(φ),φ−φα⟩.

By the definition of the cost functional in (2.7), we then have

 0≤⟨T∗(Tφ−fδ)+α∇J(φ),φ−φα⟩, (3.7)

which yields the assertion. ∎

It is also an immediate consequence of MDP, see [3, Remark 2.7], that

 ||Tφα(δ)−Tφ†||L2(Z) ≤ (¯τ+1)δ. (3.8)

We use this observation in the following theorem. The first assertion below is an expected result for minimization problems of the form (2.5); see e.g. [27, Lemma 1].

Theorem 3.3.

Under the same assumptions as in Lemma 3.2, together with the deterministic noise model ||f†−fδ|| ≤ δ, we have, for any α > 0, that

 J(φα)−J(φ†) ≤ δ2/(2α). (3.9)

Moreover, for α ∈ ¯S, the Bregman distance is bounded above by

 DJ(φα(δ,fδ),φ†) ≤ (δ2/α(δ,fδ))(3/2+¯τ). (3.10)
Proof.

Since φα, for any α > 0, is the minimizer of the cost functional Fα, we have

 Fα(φα,fδ) = 12||Tφα−fδ||2L2(Z)+αJ(φα) ≤ 12||Tφ†−fδ||2L2(Z)+αJ(φ†)=Fα(φ†,fδ),

which is in other words,

 α(J(φα)−J(φ†))≤12||Tφ†−fδ||2L2(Z)−12||Tφα−fδ||2L2(Z). (3.11)

By the assumed deterministic noise model and the fact that ||Tφ†−fδ||L2(Z) ≤ δ, one obtains the first assertion:

 J(φα)−J(φ†) ≤ δ2/(2α).

Regarding the second assertion: since α ∈ ¯S, by the definition in (3.1) we have ||Tφα(δ,fδ)−fδ||L2(Z) ≤ ¯τδ. From the formulation of the Bregman distance (2.12) and Lemma 3.2, we obtain

 DJ(φα(δ,fδ),φ†) ≤ |J(φα(δ,fδ))−J(φ†)| + |⟨∇J(φ†),φα(δ,fδ)−φ†⟩|
                 ≤ δ2/(2α(δ,fδ)) + (δ/α(δ,fδ))||Tφα(δ,fδ)−Tφ†||L2(Z).

Hence, the observation in (3.8) yields the second assertion. ∎

Obtaining tight convergence rates with an a posteriori strategy for the choice of the regularization parameter is the aim of this section. Henceforth, we will show the impact of this strategy on convergence and convergence rates by associating it with the index function appearing in Assumption 3.1. In [27, Eq (3.2)], a reasonable index function was introduced. Analogously to that function, we introduce

 Φ(δ,fδ):=δ2/(2Ψ(δ)). (3.12)

Using this index function, it is possible to formulate an improved counterpart of the result in [27, Corollary 1]. First, we give a preliminary estimate based on the variational inequality.
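As an illustration of (3.12), taking the concave index function Ψ(t) = √t (our example, not a choice made in the text) gives the explicit parameter value Φ(δ,fδ) = δ²/(2√δ) = δ^(3/2)/2:

```python
def Phi(delta, Psi):
    """Phi(delta, f_delta) = delta^2 / (2 * Psi(delta)), cf. (3.12)."""
    return delta ** 2 / (2.0 * Psi(delta))

Psi = lambda t: t ** 0.5     # a concave index function: increasing, Psi(0) = 0
alpha = Phi(0.01, Psi)       # = 0.01**1.5 / 2 = 5e-4
```

Since Ψ(δ) → 0 as δ → 0 more slowly than δ², this choice of α(δ,fδ) decays to zero, but slower than δ², as required for a sensible a posteriori rule.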

Lemma 3.4.

[27, Lemma 2] Let φα, for some α > 0, satisfy Assumption 3.1. Then

 ||Tφα−Tφ†||2L2(Z)≤4δ2+4αΨ(||Tφα−Tφ†||L2(Z)),

where φ† is the true solution of the problem (2.5).

We are now ready to state our result, which is comparable with [27, Corollary 1]. In our formulation, we still follow the a posteriori rule for the choice of the regularization parameter introduced in (3.1).

Corollary 3.5.

Under the same assumptions as in Lemma 3.4, if the regularization parameter is chosen as

 α(δ,fδ):=Φ(δ,fδ), (3.13)

then we have

 ||Tφα(δ,fδ)−Tφ†||L2(Z) ≤ δ√(6+2¯τ), (3.14)

where the fixed ¯τ > 1 is as in (3.8).

Proof.

By the index function defined in (3.12) and the result in Lemma 3.4, we immediately obtain

 ||Tφα(δ,fδ)−Tφ†||2L2(Z) ≤ 4δ2+4α(δ,fδ)Ψ(||Tφα(δ,fδ)−Tφ†||L2(Z))
  = 4δ2+(2δ2/Ψ(δ))Ψ(||Tφα(δ,fδ)−Tφ†||L2(Z))
  ≤ 4δ2+2δ||Tφα(δ,fδ)−Tφ†||L2(Z)
  ≤ 4δ2+2δ2(¯τ+1) = δ2(6+2¯τ),

where the second inequality uses the concavity of Ψ, i.e. Ψ(t) ≤ (t/δ)Ψ(δ) for t ≥ δ, and the last one uses (3.8). ∎

Given the index function introduced in (3.12), it is essential to find a lower bound for the regularization parameter α(δ,fδ).

Corollary 3.6.

Suppose that, for a chosen regularization parameter α ∈ S_ as defined in (3.2), the regularized solution of the problem (2.5) satisfies the variational inequality in Assumption 3.1. Then the regularization parameter can be bounded from below as

 ((τ_−1)2/2)·((τ_2−1)/(τ_2+1))·Φ(δ,fδ) ≤ α(δ,fδ). (3.15)
Proof.

Since τ_δ ≤ ||Tφα(δ,fδ)−fδ||L2(Z) and the regularized solution satisfies the assertion in Assumption 3.1, we immediately obtain

 (τ_2δ2)/2 ≤ (1/2)||Tφα(δ,fδ)−fδ||2L2(Z) ≤ δ2/2+α(J(φ†)−J(φα(δ,fδ))) ≤ δ2/2+αΨ(||Tφα(δ,fδ)−Tφ†||L2(Z)),

and this follows up

 δ2 ≤ (2α/(τ_2−1))Ψ(||Tφα(δ,fδ)−Tφ†||L2(Z)). (3.16)

We plug this into the bound in Lemma 3.4, with the abbreviation pα := ||Tφα(δ,fδ)−Tφ†||L2(Z):

 p2α ≤ 4δ2+4αΨ(pα) ≤ (8α/(τ_2−1))Ψ(pα)+4αΨ(pα)   (3.17)
     = 4αΨ(pα)·(τ_2+1)/(τ_2−1).

Note that

 τ_δ ≤ ||Tφα(δ,fδ)−fδ||L2(Z) ≤ pα+δ,

which implies

 (τ_−1)δ ≤ pα. (3.18)

Hence, from (3.17),

 ((τ_−1)2/2)·((τ_2−1)/(τ_2+1))·Φ(δ,fδ) ≤ α. (3.19)

Theorem 3.7.

Suppose that the regularized solution of the problem (2.5) obeys Assumption 3.1, for some regularization parameter α = α(δ,fδ) satisfying

 τ_δ ≤ ||Tφα(δ,fδ)−fδ||L2(Z) ≤ ¯τδ,

where 1 < τ_ ≤ ¯τ are fixed and α(δ,fδ) obeys the lower bound in Corollary 3.6. Then, by the second assertion (3.10) in Theorem 3.3, the Bregman distance can be bounded as

 DJ(φα(δ,fδ),φ†)≤O(Ψ(δ)). (3.20)
Proof.

Corollary 3.6 and the index function defined by (3.12) provide the result

 DJ(φα(δ,fδ),φ†) ≤ (δ2(3/2+¯τ)) / (((τ_−1)2/2)·((τ_2−1)/(τ_2+1))·Φ(δ,fδ))
                 = 4Ψ(δ)·((τ_2+1)/((τ_−1)3(τ_+1)))·(3/2+¯τ). ∎

Theorem 3.8.

Let T : V → L2(Z) be a compact and linear operator. Over the compact and convex domain Ω, let φα(δ,fδ) satisfy the assumptions of Lemma 3.2 and Assumption 3.1. If the regularization parameter is chosen as α(δ,fδ) := Φ(δ,fδ), where Φ is defined by (3.12) with the given noisy measurement fδ, then one can find the following upper bound for the symmetric Bregman distance:

 DJ(φα(δ,fδ),φ†)≤DsymJ(φα(δ,fδ),φ†)≤1ϵ(1~γ+1)Ψ