# Convergence analysis in convex regularization depending on the smoothness degree of the penalizer

## Abstract

We consider the minimization of a least squares functional with a smooth, lower semi-continuous, convex penalizer. Over some compact and convex subset of the Hilbert space the regularizer is implicitly defined as where So the cost functional associated with some given linear, compact and injective forward operator

where is the given perturbed data and is its perturbation amount. Convergence of the regularized optimal solution to the true solution is analysed depending on the smoothness degree of the penalizer, *i.e.* the cases in In both cases, we define a regularization parameter that is compatible with the condition

for some fixed In the case of we are able to evaluate the discrepancy with the Hessian Lipschitz constant of the functional

Keywords: convex regularization, Bregman divergence, Hessian Lipschitz constant, discrepancy principle.

## 1 Introduction

In this work, over some compact and convex subset of the Hilbert space we formulate our main variational minimization problem,

Here, for is convex and is the regularization parameter. Following [10], we construct the parametrized solution for the problem (Equation 1) satisfying

(i) For any there exists a solution to the problem (Equation 1);

(ii) For any there is no more than one solution;

(iii) Convergence of the regularized solution to the true solution must depend on the given data, *i.e.* whilst

where is the true measurement and is the noise level.

Condition ‘(iii)’ states that when the given measurement lies in some ball centered at the true measurement, the expected solution must lie in the corresponding ball. It is also required that this solution depends on the data Therefore, we are always tasked with finding an approximation of the unbounded inverse operator by a bounded linear operator

As an alternative to the well-established Tikhonov regularization, [21], the study of convex variational regularization with a general penalizer has become important over the last decade. The introduction of a new image denoising method named *total variation*, [24], marked the commencement of this study. Application and analysis of the method have been widely carried out in the communities of inverse problems and optimization, [1]. In particular, formulating the minimization problem as a variational problem and estimating convergence rates with variational source conditions has also become popular recently, [6]. Different from the available literature, we take into account one fact: for some given measurement with the noise level and forward operator the regularized solution to the problem (Equation 1) should satisfy for some fixed With this fact, we manage to obtain tight convergence rates for and we can carry out this analysis for a general smooth, convex penalty for the cases We will be able to quantify the tight convergence rates under the assumption that is defined over space for To be more specific, we will observe that the rule for the choice of regularization parameter must contain the Lipschitz constant in addition to the noise level That is, when we will need class.

## 2 Notations and prerequisite knowledge

Let be the space of continuous functions on the compact domain Then, function space

In addition to the traditional spaces, we will need to address for the purpose of convergence analysis. In general for an open set a mapping is said to be of *class* if it is of class and its th partial derivatives are not just continuous but strictly continuous on [23]. Then, for a smooth and convex functional defined over there exists a Lipschitz constant such that

When by we denote the well-known Lipschitz constant. When will be the Hessian Lipschitz constant, [14].

Over some compact and convex domain the variational minimization problem is formulated as follows,

with its penalty where and is the regularization parameter. Another dual minimization problem to (Equation 3) is given by

In the Hilbert scales, it is known that the solution of the penalized minimization problem (Equation 3) equals the solution of the constrained minimization problem (Equation 4), [6]. The regularized solution of the problem (Equation 3) satisfies the following first order optimality conditions,

In this work, the radii of the ball are estimated, by means of the Bregman divergence, with potential The choice of regularization parameter in this work does not require any *a priori* knowledge about the true solution. We always work with perturbed data and introduce the rates according to the perturbation amount

### 2.1 Bregman divergence

We will be able to quantify the rate of the convergence of by means of different formulations of the Bregman divergence. The following formulation emphasizes the functionality of the Bregman divergence in proving the norm convergence of the minimizer of the convex minimization problem to the true solution.

Throughout our norm convergence estimations, we refer to this definition for the case of convexity. We will also study different formulations of the Bregman divergence. We introduce these different formulations below.

The reader may also refer to Appendix ? for further properties of the Bregman divergence. In fact, another estimation similar to ( ?), for can also be derived by making a further assumption about the functional one of which is strong convexity with modulus [3]. Below is this alternative way of obtaining ( ?) when

Let us begin with considering the Taylor expansion of

Then the Bregman divergence

Since is strictly convex, due to strong convexity and hence one obtains that

where is the modulus of convexity.
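
As a numerical illustration of the lower bound above, the following minimal sketch evaluates the Bregman divergence of a strongly convex penalty and compares it with the quadratic lower bound given by the modulus of convexity. The penalty `J`, the modulus `C` and the sample vectors are assumptions chosen for illustration only; they are not taken from the analysis in this work.

```python
import math

C = 0.5  # assumed modulus of strong convexity of the illustrative penalty


def J(u):
    """Illustrative smooth penalty: (C/2)|u|^2 + sum_i log(cosh(u_i))."""
    return sum(0.5 * C * x * x + math.log(math.cosh(x)) for x in u)


def grad_J(u):
    """Gradient of J, computed componentwise."""
    return [C * x + math.tanh(x) for x in u]


def bregman(u, v):
    """Bregman divergence D_J(u, v) = J(u) - J(v) - <grad J(v), u - v>."""
    g = grad_J(v)
    return J(u) - J(v) - sum(gi * (ui - vi) for gi, ui, vi in zip(g, u, v))


def sq_dist(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


# Hypothetical sample points.
u = [1.0, -2.0, 0.3]
v = [0.2, 0.5, -1.0]
divergence = bregman(u, v)
lower_bound = 0.5 * C * sq_dist(u, v)  # (C/2)|u - v|^2
print(divergence, lower_bound)
```

Since the `log(cosh)` part is convex, its own Bregman divergence is nonnegative, so the divergence of `J` dominates the quadratic term contributed by the strongly convex part.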

Above, in ( ?), we have set In this case, one must assume even more than stated about the existence of the modulus of convexity These assumptions can be formulated in the following way. Suppose that there exists some measurement lying in the ball for all small enough such that the following hold,

Then is convex and according to Proposition ?,

In addition to the traditional definition of the Bregman divergence in ( ?), the *symmetrical Bregman divergence* is also given below, [16],

With the symmetrical Bregman divergence formulated, and following from Definition ?, we give the last proposition of this section.

The proof is a straightforward consequence of the estimation in ( ?) and the symmetrical Bregman divergence definition given by (Equation 7).
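
To make the symmetrical formulation concrete, the sketch below assumes the common convention in which the symmetrical Bregman divergence is the sum of the two one-sided divergences, which for a differentiable penalty reduces to the pairing of the gradient difference with the argument difference. The penalty `J` and the sample points are hypothetical; the exact form of (Equation 7) is not reproduced here.

```python
def J(u):
    """Illustrative smooth convex penalty: sum_i (u_i^4 / 4 + u_i^2 / 2)."""
    return sum(0.25 * x ** 4 + 0.5 * x ** 2 for x in u)


def grad_J(u):
    """Gradient of J, computed componentwise."""
    return [x ** 3 + x for x in u]


def inner(a, b):
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def bregman(u, v):
    """One-sided Bregman divergence J(u) - J(v) - <grad J(v), u - v>."""
    diff = [ui - vi for ui, vi in zip(u, v)]
    return J(u) - J(v) - inner(grad_J(v), diff)


def symmetric_bregman(u, v):
    """Assumed convention: sum of the two one-sided divergences."""
    return bregman(u, v) + bregman(v, u)


def gradient_pairing(u, v):
    """<grad J(u) - grad J(v), u - v>, the closed form of the sum above."""
    gdiff = [gu - gv for gu, gv in zip(grad_J(u), grad_J(v))]
    diff = [ui - vi for ui, vi in zip(u, v)]
    return inner(gdiff, diff)


# Hypothetical sample points.
u = [0.7, -1.2, 0.4]
v = [-0.3, 0.5, 1.1]
print(symmetric_bregman(u, v), gradient_pairing(u, v))
```

The function values of `J` cancel in the sum, which is why the symmetrical divergence can be controlled through gradient information alone.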

### 2.2 Appropriate regularization parameter with discrepancy principle

A regularization parameter is admissible for when

for some fixed We seek a rule for choosing as a function of such that (Equation 8) is satisfied and

Following [13], [19], in order to obtain tight rates of convergence of we define such that

The strong relation between the discrepancy and the norm convergence of can be formulated in the following lemma.

The desired result follows from the following straightforward calculations,
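
The parameter choice discussed in this subsection can be sketched numerically. The example below is a minimal illustration under strong simplifying assumptions: the forward operator is a diagonal matrix with hypothetical singular values `s`, the penalty is the quadratic one (so the regularized solution has a closed form), and the admissibility condition is taken in the form "residual norm at most `tau * delta`" with a fixed `tau`. None of these names or values come from the original analysis.

```python
import math


def regularized_solution(s, f, alpha):
    """Minimizer of 0.5*|S u - f|^2 + 0.5*alpha*|u|^2 for diagonal S."""
    return [si * fi / (si * si + alpha) for si, fi in zip(s, f)]


def residual_norm(s, f, alpha):
    """Discrepancy |S u_alpha - f| of the regularized solution."""
    u = regularized_solution(s, f, alpha)
    return math.sqrt(sum((si * ui - fi) ** 2 for si, ui, fi in zip(s, u, f)))


def discrepancy_alpha(s, f, delta, tau=1.5, lo=1e-14, hi=1e6, iters=200):
    """Largest alpha (up to bisection accuracy) with residual <= tau*delta.

    The residual is increasing in alpha for this quadratic model, so we
    bisect on a logarithmic scale and keep the admissible endpoint `lo`.
    """
    target = tau * delta
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if residual_norm(s, f, mid) > target:
            hi = mid
        else:
            lo = mid
    return lo


# Hypothetical test problem: decaying singular values, deterministic noise.
n = 8
s = [1.0 / (i + 1) for i in range(n)]
u_true = [1.0] * n
delta = 1e-2
noise = [delta * (-1) ** i / math.sqrt(n) for i in range(n)]  # norm = delta
f = [si * ui + ni for si, ui, ni in zip(s, u_true, noise)]

tau = 1.5
alpha = discrepancy_alpha(s, f, delta, tau)
print(alpha, residual_norm(s, f, alpha), tau * delta)
```

Bisection is used here only because the residual is monotone in the parameter for the quadratic model; for a general penalty the regularized solution would have to be computed by an iterative solver at each trial value.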

## 3 Monotonicity of the gradient of convex functionals

If the positive real valued convex functional is in the class of then for all defined on

What this inequality means is that at each the tangent line of the functional lies below the functional itself. The same is also true from the subdifferentiability point of view. Following from (Equation 10), one can also write that

Still from (Equation 10), by replacing with one obtains

or equivalently

Combining (Equation 11) and (Equation 12) gives,

Eventually this implies

which is the monotonicity of the gradient of convex functionals, [3].
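
The chain of inequalities in this section can be checked numerically. The sketch below assumes the standard reading of the argument: convexity gives a two-sided bound on the difference of function values in terms of the gradients at the two points, and subtracting the two sides yields monotonicity of the gradient. The functional `J` and the sampled points are hypothetical choices for illustration.

```python
import math
import random


def J(u):
    """Illustrative smooth convex functional: sum_i log(cosh(u_i))."""
    return sum(math.log(math.cosh(x)) for x in u)


def grad_J(u):
    """Gradient of J, computed componentwise."""
    return [math.tanh(x) for x in u]


def inner(a, b):
    """Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def check_pair(u, v, tol=1e-12):
    """Check the two-sided convexity bound and gradient monotonicity."""
    diff = [ui - vi for ui, vi in zip(u, v)]
    lower = inner(grad_J(v), diff)   # <grad J(v), u - v>
    upper = inner(grad_J(u), diff)   # <grad J(u), u - v>
    gap = J(u) - J(v)
    sandwich = lower - tol <= gap <= upper + tol
    monotone = upper - lower >= -tol  # <grad J(u) - grad J(v), u - v> >= 0
    return sandwich and monotone


random.seed(0)
pairs = [
    ([random.uniform(-3, 3) for _ in range(4)],
     [random.uniform(-3, 3) for _ in range(4)])
    for _ in range(100)
]
all_ok = all(check_pair(u, v) for u, v in pairs)
print(all_ok)
```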

Owing to the relation in (Equation 13), the weak convergence of the regularized solution to the true solution can easily be shown with the choice of regularization parameter

Since is the minimizer of the cost functional then

which is in other words,

From the convexity of the penalization term a lower bound has already been found in (Equation 13). Then following from (Equation 13), the last inequality implies,

since With the choice of for any the desired result is obtained.

## 4 Convergence Results for

We now come to the point where we analyse each case when for In each case, we will consider the discrepancy principle for the choice of regularization parameter while providing the norm convergence.

### 4.1 When the penalty is defined over

The first part of the following formulation has been studied in [6]. There, the authors obtain some convergence in terms of a Lagrange multiplier instead of a regularization parameter According to the theoretical setup given by the authors, their convergence rate explicitly contains the Lagrange multiplier defined as The second part, on the other hand, has been motivated by [16]. All convergence results are obtained under the assumption that the penalizer is convex according to ( ?).

First recall the formulation for the Bregman divergence associated with the penalty in ( ?). Convexity of the penalizer brings the following estimation by the second part of (Equation 13),

Then in fact ( ?) can be bounded by,

due to the first order optimality conditions in (Equation 5), *i.e.* The inner product can also be written in the composite form,

where the true solution satisfies Taking the absolute value of the right-hand side, applying the Cauchy-Schwarz inequality, and recalling that by (Equation 9) brings

As for the upper bound for we adapt (Equation 7) in the following way

Again by the first order optimality conditions in (Equation 5), then

We split this inner product over the term together with the absolute value of each part as such,

which is the consequence of Cauchy-Schwarz. Now again by the condition in (Equation 9)

Considering the defined regularization parameter, both in (Equation 15) and in (Equation 16) yields the desired upper bounds for and respectively. Since is convex, then the norm convergence of is obtained due to ( ?).

In fact those rates also imply another faster convergence rate when the regularization parameter is defined as . To observe this, a different formulation of the Bregman divergence is necessary. In Definition ?, take to formulate the following. However, we need to recall the assumptions about the convexity of in (Equation 6) and ( ?).

As given by (Equation 9), Additionally the noisy measurement to the true measurement satisfies In Theorem ? above, we have estimated a pair of convergence rates with the same regularization parameter So for defined by ( ?) will provide the result below;

As has been estimated in Theorem ? when Hence,

Now, since is convex (see Def. ?), by ( ?) and by the assumptions (Equation 6) and ( ?), we have,

### 4.2 When the penalty is defined over

Surely the convergence rates above are still preserved when the penalty is defined over since However, one may be interested in the discrepancy principle in this more specific case. Above, we have formulated those convergence rates under the assumption We will now analyse the convergence assuming Here we will define the regularization parameter also as a function of the Hessian Lipschitz constant, [14]. We begin with estimating the discrepancy

Let us consider the following second order Taylor expansion,

Obviously, this Taylor expansion is bounded by

where is the Hessian Lipschitz constant of the functional After some arrangement with the explicit definition in the problem (Equation 3) the inequality above reads,

Now by the early estimations for the difference in (Equation 11),

After Cauchy-Schwarz and Young’s inequalities on the right hand side, we have

For convenience, we combine the last two terms on the right hand side under one notation. Then,

Since for hence
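
The role of the Hessian Lipschitz constant in the second order Taylor expansion used in this subsection can be illustrated numerically. The sketch below relies on the standard consequence of Hessian Lipschitz continuity, namely that the second order Taylor remainder is bounded by one sixth of the Hessian Lipschitz constant times the cubed distance. The functional `J`, the box domain and the constant `L_H` are assumptions for illustration; the exact estimate of this section is not reproduced here.

```python
import math
import random

# Assumed Hessian Lipschitz constant of J(u) = sum_i u_i^4 on the box [-1, 1]^n:
# the Hessian is diag(12 u_i^2), and |12 a^2 - 12 b^2| <= 24 |a - b| on [-1, 1].
L_H = 24.0


def J(u):
    """Illustrative smooth convex functional on the box [-1, 1]^n."""
    return sum(x ** 4 for x in u)


def grad_J(u):
    """Gradient of J, computed componentwise."""
    return [4.0 * x ** 3 for x in u]


def hess_diag(u):
    """Diagonal of the (diagonal) Hessian of J."""
    return [12.0 * x ** 2 for x in u]


def taylor_remainder(u, v):
    """J(v) minus the second order Taylor polynomial of J around u."""
    d = [vi - ui for ui, vi in zip(u, v)]
    first = sum(g * di for g, di in zip(grad_J(u), d))
    second = 0.5 * sum(h * di * di for h, di in zip(hess_diag(u), d))
    return J(v) - J(u) - first - second


def cubic_bound(u, v):
    """(L_H / 6) * |v - u|^3 with the Euclidean norm."""
    dist = math.sqrt(sum((vi - ui) ** 2 for ui, vi in zip(u, v)))
    return L_H / 6.0 * dist ** 3


random.seed(1)
samples = [
    ([random.uniform(-1, 1) for _ in range(3)],
     [random.uniform(-1, 1) for _ in range(3)])
    for _ in range(200)
]
bound_holds = all(
    abs(taylor_remainder(u, v)) <= cubic_bound(u, v) + 1e-12 for u, v in samples
)
print(bound_holds)
```

The compactness of the domain matters here: the quartic functional has no global Hessian Lipschitz constant, and `L_H = 24` is valid only on the box.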

## 5 Summary of the Convergence Rates

In this work, we have obtained the convergence rates following the footsteps of the counterpart work in [6]. However, we have also taken into account one more fact, which is where fulfils the condition (Equation 9). It has been observed that the convexity condition for the penalty is crucial to obtain norm convergence by means of the Bregman divergence. We have not given any analytical evaluation of without any specific penalty Note that these convergence rates are true for where and Below we summarize the corresponding convergence rate estimates per Bregman divergence formulation.

Bregman divergence estimate | estimate | |
---|---|---|

## Acknowledgement

The author is indebted to Prof. Dr. D. Russell Luke for valuable discussions on different parts of this work.

## References

- [1] R. Acar, C. R. Vogel. *Analysis of bounded variation penalty methods for ill-posed problems,* Inverse Problems, Vol. 10, No. 6, 1217 - 1229, 1994.
- [2] M. Bachmayr and M. Burger. *Iterative total variation schemes for nonlinear inverse problems,* Inverse Problems, 25, 105004 (26pp), 2009.
- [3] H. H. Bauschke, P. L. Combettes. *Convex analysis and monotone operator theory in Hilbert spaces,* Springer New York, 2011.
- F. Schöpfer, T. Schuster. *Minimization of Tikhonov Functionals in Banach Spaces,* Abstr. Appl. Anal., Art. ID 192679, 19 pp, 2008.
- [4] J. M. Bardsley and A. Luttman. *Total variation-penalized Poisson likelihood estimation for ill-posed problems,* Adv. Comput. Math., 31:25-59, 2009.
- [5] K. Bredies. *A forward-backward splitting algorithm for the minimization of non-smooth convex functionals in Banach space,* Inverse Problems 25, no. 1, 015005, 20 pp, 2009.
- [6] M. Burger, S. Osher. *Convergence rates of convex variational regularization,* Inverse Problems, 20(5), 1411 - 1421, 2004.
- [7] A. Chambolle, P. L. Lions. *Image recovery via total variation minimization and related problems,* Numer. Math. 76, 167 - 188, 1997.
- [8] T. F. Chan and K. Chen. *An optimization-based multilevel algorithm for total variation image denoising,* Multiscale Model. Simul. 5, no. 2, 615-645, 2006.
- [9] T. Chan, G. Golub and P. Mulet. *A nonlinear primal-dual method for total variation-based image restoration,* SIAM J. Sci. Comp. 20: 1964-1977, 1999.
- [10] D. Colton and R. Kress. *Inverse Acoustic and Electromagnetic Scattering Theory,* Springer Verlag Series in Applied Mathematics Vol. 93, Third Edition 2013.
- [11] D. Dobson, O. Scherzer. *Analysis of regularized total variation penalty methods for denoising,* Inverse Problems, Vol. 12, No. 5, 601 - 617, 1996.
- [12] D. C. Dobson, C. R. Vogel. *Convergence of an iterative method for total variation denoising,* SIAM J. Numer. Anal., Vol. 34, No. 5, 1779 - 1791, 1997.
- [13] H. W. Engl, M. Hanke, A. Neubauer. *Regularization of inverse problems,* Math. Appl., 375. Kluwer Academic Publishers Group, Dordrecht, 1996.
- [14] J. M. Fowkes, N. I. M. Gould, C. L. Farmer. *A branch and bound algorithm for the global optimization of Hessian Lipschitz continuous functions,* J. Glob. Optim., 56, 1792 - 1815, 2013.
- [15] M. Grasmair. *Generalized Bregman distances and convergence rates for non-convex regularization methods,* Inverse Problems 26, 11, 115014, 16pp, 2010.
- [16] M. Grasmair. *Variational inequalities and higher order convergence rates for Tikhonov regularisation on Banach spaces,* J. Inverse Ill-Posed Probl., 21, 379-394, 2013.
- [17] M. Grasmair, M. Haltmeier, O. Scherzer. *Necessary and sufficient conditions for linear convergence of $\ell^1$-regularization,* Comm. Pure Appl. Math. 64(2), 161-182, 2011.
- [18] V. Isakov. *Inverse problems for partial differential equations.* Second edition. Applied Mathematical Sciences, 127. Springer, New York, 2006.
- [19] A. Kirsch. *An introduction to the mathematical theory of inverse problems.* Second edition. Applied Mathematical Sciences, 120. Springer, New York, 2011.
- [20] D. A. Lorenz. *Convergence rates and source conditions for Tikhonov regularization with sparsity constraints,* J. Inv. Ill-Posed Problems, 16, 463-478, 2008.
- [21] A. N. Tikhonov. *On the solution of ill-posed problems and the method of regularization,* Dokl. Akad. Nauk SSSR, 151, 501-504, 1963.
- [22] A. N. Tikhonov, V. Y. Arsenin. *Solutions of ill-posed problems.* Translated from the Russian. Preface by translation editor Fritz John. Scripta Series in Mathematics. V. H. Winston & Sons, Washington, D.C.: John Wiley & Sons, New York-Toronto, Ont.-London, xiii+258 pp, 1977.
- [23] R. T. Rockafellar, R. J.-B. Wets. *Variational Analysis.* Fundamental Principles of Mathematical Sciences, 317. Springer-Verlag, Berlin, 1998.
- [24] L. I. Rudin, S. J. Osher, E. Fatemi. *Nonlinear total variation based noise removal algorithms,* Physica D, 60, 259-268, 1992.
- [25] C. R. Vogel, M. E. Oman. *Iterative methods for total variation denoising,* SIAM J. Sci. Comput., Vol. 17, No. 1, 227-238, 1996.