Epigraphical splitting for solving constrained convex formulations of inverse problems with proximal tools


G. Chierchia, N. Pustelnik, J.-C. Pesquet, and B. Pesquet-Popescu
Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, 75014 Paris, France
ENS Lyon, Laboratoire de Physique, UMR CNRS 5672, F-69007 Lyon, France
Université Paris-Est, LIGM, UMR CNRS 8049, 77454 Marne-la-Vallée, France
Abstract

We propose a proximal approach to deal with a class of convex variational problems involving nonlinear constraints. A large family of constraints, proven to be effective in the solution of inverse problems, can be expressed as the lower level set of a sum of convex functions evaluated over different, but possibly overlapping, blocks of the signal. For such constraints, the associated projection operator generally does not have a simple form. We circumvent this difficulty by splitting the lower level set into as many epigraphs as functions involved in the sum. A closed half-space constraint is also enforced, in order to limit the sum of the introduced epigraphical variables to the upper bound of the original lower level set. In this paper, we focus on a family of constraints involving linear transforms of distance functions to a convex set or of $\ell_p$ norms with $p \in \{2,+\infty\}$. In these cases, the projection onto the epigraph of the involved function has a closed form expression.

The proposed approach is validated in the context of image restoration with missing samples, by making use of constraints based on Non-Local Total Variation. Experiments show that our method leads to significant improvements in terms of convergence speed over existing algorithms for solving similar constrained problems. A second application to a pulse shape design problem is provided in order to illustrate the flexibility of the proposed approach.

1 Introduction

As an offspring of the wide interest in frame representations and sparsity promoting techniques for data recovery, proximal methods have become popular for solving large-size non-smooth convex optimization problems [1, 2, 3]. The efficiency of these methods in the solution of inverse problems has been widely studied in the recent signal and image processing literature (see for instance [4, 5, 6, 7, 8, 9] and references therein). Even if proximal algorithms and the associated convergence properties have been deeply investigated [10, 11, 12, 13], some questions persist in their use for solving inverse problems. A first question is: how can we set the parameters serving to enforce the regularity of the solution in an automatic way? Various strategies were proposed in order to address this question [14, 15, 16, 17, 18], but the computational cost of these methods is often high, especially when several regularization parameters have to be set. Alternatively, it has been recognized for a long time that incorporating constraints directly on the solutions [19, 20, 21, 22, 23, 24, 25], instead of considering regularized functions, may often facilitate the choice of the involved parameters. Indeed, in a constrained formulation, the constraint bounds are usually related to some physical properties of the target solution or some knowledge of the degradation process, e.g. the noise statistical properties. Note also that there exist some conceptual Lagrangian equivalences between regularized solutions to inverse problems and constrained ones, although some caution should be taken when the regularization functions are nonsmooth (see [26] where the case of a single regularization parameter is investigated).

Another question is related to the selection of the most appropriate algorithm within the class of proximal methods according to a given application. This also raises the question of the computation of the proximity operators associated with the different functions involved in the criterion. In this context, the objective of this paper is to propose an efficient splitting technique for solving some constrained convex optimization problems of the form:

Problem 1.1.
\[
\underset{x\in\mathcal{H}}{\text{minimize}}\;\sum_{r=1}^{R} g_r(T_r x)
\quad\text{s.t.}\quad
\begin{cases}
H_1 x \in C_1,\\
\;\;\vdots\\
H_S x \in C_S,
\end{cases}
\tag{1}
\]

where $\mathcal{H}$ is a real Hilbert space, and

1. for every $r \in \{1,\ldots,R\}$, $T_r$ is a bounded linear operator from $\mathcal{H}$ to $\mathbb{R}^{N_r}$,

2. for every $r \in \{1,\ldots,R\}$, $g_r$ is a proper lower-semicontinuous convex function from $\mathbb{R}^{N_r}$ to $]-\infty,+\infty]$,

3. for every $s \in \{1,\ldots,S\}$, $H_s$ is a bounded linear operator from $\mathcal{H}$ to $\mathbb{R}^{M_s}$,

4. for every $s \in \{1,\ldots,S\}$, $C_s$ is a nonempty closed convex subset of $\mathbb{R}^{M_s}$.

More precisely, the present work aims at designing a method to address Problem 1.1 when some convex constraints are expressed as follows: for some $s \in \{1,\ldots,S\}$,

\[(\forall x\in\mathcal{H})\quad H_s x \in C_s \;\Leftrightarrow\; h_s(H_s x) \le \eta_s, \tag{2}\]

where $\eta_s \in \mathbb{R}$ and $h_s$ is a proper lower-semicontinuous convex function from $\mathbb{R}^{M_s}$ to $]-\infty,+\infty]$. Indeed, the projection onto the convex set defined in (2) often does not have a closed form expression. In the present work, we will show that:

1. when the function $h_s$ in (2) corresponds to a decomposable loss, i.e. it can be expressed as the sum of functions evaluated over different blocks of the vector $H_s x$, the problem of computing the projection onto the associated convex set can be addressed by resorting to a splitting approach that decomposes the set into a collection of epigraphs and a half-space;

2. the projection operator associated with an epigraph (namely the epigraphical projection) has a closed form for some functions of practical interest, such as the absolute value raised to a power $\beta \ge 1$, the distance to a convex set, and the $\ell_p$-norm with $p \in \{2,+\infty\}$;

3. in the context of image restoration, regularity constraints based on Total Variation [27] and Non-Local Total Variation [28] can be efficiently handled by the proposed epigraphical splitting, which significantly speeds up the convergence (in terms of execution time) with respect to standard iterative solutions [29, 30].

The paper is organized as follows. In Section 2, we review the algorithms which are applicable for solving large-size convex optimization problems, thus motivating the choice of proximal methods, and we review the variable-splitting techniques commonly used with these methods. In order to deal with a constraint expressed under the form (2), we propose in Section 3 a novel splitting approach involving an epigraphical projection. In addition, closed form expressions for specific epigraphical projections are given. Experiments in two different contexts are presented in Section 4. The first one concerns an image reconstruction problem, while the second one is related to pulse shape design for digital communications. Finally, some conclusions are drawn in Section 5.

Notation: Let $\mathcal{H}$ be a real Hilbert space endowed with the norm $\|\cdot\|$ and the scalar product $\langle\cdot,\cdot\rangle$. $\Gamma_0(\mathcal{H})$ denotes the set of proper lower-semicontinuous convex functions from $\mathcal{H}$ to $]-\infty,+\infty]$. The epigraph of $\varphi \in \Gamma_0(\mathcal{H})$ is the nonempty closed convex subset of $\mathcal{H}\times\mathbb{R}$ defined as $\operatorname{epi}\varphi = \{(y,\zeta) \in \mathcal{H}\times\mathbb{R} \mid \varphi(y) \le \zeta\}$ and the lower level set of $\varphi$ at height $\zeta \in \mathbb{R}$ is the closed convex subset of $\mathcal{H}$ defined as $\operatorname{lev}_{\le\zeta}\varphi = \{y \in \mathcal{H} \mid \varphi(y) \le \zeta\}$. A subgradient of $\varphi$ at $y \in \mathcal{H}$ is an element of its subdifferential, defined as $\partial\varphi(y) = \{t \in \mathcal{H} \mid (\forall u \in \mathcal{H})\; \langle u-y, t\rangle + \varphi(y) \le \varphi(u)\}$. When $\varphi$ is Gâteaux-differentiable at $y$, $\partial\varphi(y) = \{\nabla\varphi(y)\}$, where $\nabla\varphi(y)$ is the gradient of $\varphi$ at $y$. Let $C$ be a nonempty closed convex subset of $\mathcal{H}$. The relative interior of $C$ is denoted by $\operatorname{ri} C$. For every $y \in \mathcal{H}$, the indicator function of $C$ is given by

\[\iota_C(y) = \begin{cases} 0, & \text{if } y \in C,\\ +\infty, & \text{otherwise}, \end{cases} \tag{3}\]

the projection onto $C$ reads $P_C(y) = \operatorname{argmin}_{u \in C}\|u-y\|$, and the distance to $C$ is given by $d_C(y) = \|y - P_C(y)\|$.
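As a minimal numerical illustration of this notation (our sketch, not part of the paper), take $C$ to be the Euclidean unit ball in $\mathbb{R}^2$, for which $P_C$, $d_C$ and $\iota_C$ all have elementary expressions:

```python
import numpy as np

# Illustration of P_C, d_C and iota_C for C the Euclidean unit ball in R^2.
# Function names are ours, chosen for readability.

def proj_ball(y, radius=1.0):
    """Projection P_C(y) onto the Euclidean ball of given radius."""
    n = np.linalg.norm(y)
    return y if n <= radius else (radius / n) * y

def dist_ball(y, radius=1.0):
    """Distance d_C(y) = ||y - P_C(y)||."""
    return np.linalg.norm(y - proj_ball(y, radius))

def indicator_ball(y, radius=1.0):
    """Indicator function iota_C(y) as in (3)."""
    return 0.0 if np.linalg.norm(y) <= radius else np.inf

y = np.array([3.0, 4.0])   # ||y|| = 5
p = proj_ball(y)           # radial projection -> [0.6, 0.8]
```

Here `dist_ball(y)` equals $\|y\| - 1 = 4$, and the indicator is $+\infty$ at `y` and $0$ at any interior point.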

2 Proximal tools

2.1 From gradient descent to proximal algorithms

The first methods for finding a solution to an inverse problem were restricted to the use of a differentiable cost function [31], i.e. Problem 1.1 where $S = 0$ and, for every $r \in \{1,\ldots,R\}$, $g_r$ denotes a differentiable function. In this context, gradient-based algorithms appear to be the most efficient solutions when an iterative procedure is required (see [32] and references therein). However, in order to model additional properties, sparsity promoting penalizations or hard constraints may be introduced, and the differentiability property is not satisfied anymore. One way to circumvent this difficulty is to resort to smart approximations in order to smooth the involved non-differentiable functions [33, 34, 35, 36]. If one wants to address the original nonsmooth problem without approximation errors, one may apply some specific algorithms [37], the convergence of which is guaranteed under restrictive assumptions. Interior point methods [38] can also be employed for small to medium size optimization problems.

On the other hand, in order to solve convex feasibility problems, i.e. to find a vector belonging to the intersection of convex sets (Problem 1.1 where the functions $g_r$ are zero), iterative projection methods were developed. The Projection Onto Convex Sets algorithm (POCS) is one of the most popular approaches to solve data recovery problems [39, 40, 19, 41]. A drawback of POCS is that it is not well-suited for parallel implementations. The Parallel Projection Method (PPM) and the Method of Parallel Projections (MOPP) are variants of POCS making use of parallel projections. Moreover, these algorithms were designed to efficiently solve inconsistent feasibility problems (when the intersection of the convex sets is empty). Thorough comparisons between projection methods have been performed in [42, 43].
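The POCS iteration mentioned above can be sketched in a few lines: starting from any point, one cyclically projects onto each set. The following toy example (ours, not from the paper) uses a half-space and a ball in $\mathbb{R}^2$ whose intersection is nonempty:

```python
import numpy as np

# A minimal POCS sketch: alternating projections onto two convex sets,
# here a half-space {x : <a,x> <= b} and a ball of center c, radius r.
# The iterates converge to a point of the intersection when it is nonempty.

def proj_halfspace(y, a, b):
    """Project y onto {x : <a, x> <= b}."""
    g = a @ y - b
    return y if g <= 0 else y - (g / (a @ a)) * a

def proj_ball(y, c, r):
    """Project y onto the ball of center c and radius r."""
    d = np.linalg.norm(y - c)
    return y if d <= r else c + (r / d) * (y - c)

a, b = np.array([1.0, 1.0]), 1.0
c, r = np.zeros(2), 2.0
x = np.array([5.0, 5.0])
for _ in range(100):                 # POCS: x_{k+1} = P_{C2}(P_{C1}(x_k))
    x = proj_ball(proj_halfspace(x, a, b), c, r)
```

With this data the iterates reach the feasible point $(0.5, 0.5)$; note that each step here is sequential, which is precisely the parallelization drawback pointed out above.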

Computing the projection onto a nonempty closed convex subset of a real Hilbert space requires solving a constrained quadratic minimization problem. However, it turns out that a closed form expression of the solution to this problem is available in a limited number of instances. Some well-known examples are the projections onto hyperplanes, closed half-spaces and norm balls [36, 44]. When an expression of the direct projection is not available, the convex set can be approximated by a half-space, which leads to the concept of subgradient projection. An efficient block iterative surrogate splitting method was proposed in [45] in order to solve Problem 1.1 when the constraint sets are lower level sets of convex functions. A main limitation of this method is that the global objective function must be strictly convex. For recent works about subgradient projection methods, the reader may refer to [46, 47, 48].

A way to overcome this difficulty consists of considering proximal approaches. The key tool in these methods is the proximity operator [49] of a function $\varphi \in \Gamma_0(\mathcal{H})$, defined as

\[(\forall y\in\mathcal{H})\quad \operatorname{prox}_{\varphi}(y) = \underset{u\in\mathcal{H}}{\operatorname{argmin}}\;\frac{1}{2}\|u-y\|^2 + \varphi(u). \tag{4}\]

The proximity operator can be interpreted as an implicit subgradient step for the function $\varphi$, as $p = \operatorname{prox}_{\varphi}(y)$ is uniquely defined through the inclusion

\[y - p \in \partial\varphi(p). \tag{5}\]

Proximity operators enjoy many interesting properties [11]. In particular, they generalize the notion of projection onto a closed convex set $C$, in the sense that $\operatorname{prox}_{\iota_C} = P_C$. Hence, proximal methods provide a unifying framework that allows one to address non-smooth penalizations and hard constraints.
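Two standard instances of (4) make this concrete (a sketch of well-known closed forms, with our naming): the prox of $\lambda|\cdot|$ is soft-thresholding, and the prox of $\iota_{[lo,hi]}$ is the projection onto the interval.

```python
import numpy as np

# prox of phi(u) = lam*|u| is the soft-thresholding operator, and
# prox of the indicator iota_C is the projection P_C; here C = [lo, hi].

def prox_abs(y, lam):
    """prox_{lam|.|}(y) = sign(y) * max(|y| - lam, 0)."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def prox_indicator_interval(y, lo, hi):
    """prox of iota_[lo,hi] is the projection onto [lo, hi]."""
    return np.clip(y, lo, hi)

# Characterization (5): y - p must be a subgradient of phi at p.
# For phi = lam|.| and p > 0, this means y - p == lam.
y, lam = 2.5, 1.0
p = prox_abs(y, lam)   # -> 1.5, and y - p == lam
```

The inclusion (5) is thus easy to verify numerically for these simple cases.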

The class of proximal methods includes primal algorithms [50, 11, 12, 51, 52, 53, 54, 2, 55] and primal-dual algorithms [56, 57, 58, 59, 60, 61, 62, 63]. Primal algorithms generally require the inversion of some linear operators, while primal-dual ones only require the computation of these operators and their adjoints. Consequently, primal-dual methods are often easier to implement than primal ones, but their convergence may be slower [64, 65]. Note also that some of these methods are closely related to augmented Lagrangian approaches [66, 67].

2.2 Variable-splitting techniques

With the development of proximal methods for solving convex optimization problems, many techniques have been developed to cope with the intrinsic limitations of these methods. In particular, the computation of the proximity operator becomes intractable in general when considering the sum of several functions or a function composed with a linear operator. In this context, variable-splitting constitutes a very effective way to design easily implementable algorithms. One of the most popular examples is given by the approaches inspired by the Alternating Direction Method of Multipliers (ADMM), which deal with optimization problems of the form

\[\underset{x\in\mathcal{H}}{\text{minimize}}\;\; g_1(T_1 x) + g_2(x). \tag{6}\]

By introducing an auxiliary variable $v \in \mathbb{R}^{N_1}$, the problem is reformulated as

\[\underset{(x,v)\in\mathcal{H}\times\mathbb{R}^{N_1}}{\text{minimize}}\;\; g_1(v) + g_2(x) \quad\text{s.t.}\quad T_1 x = v. \tag{7}\]

This kind of splitting technique has often been used in image restoration [68, 66] and more recently for distributed optimization problems [69]. A similar form of splitting has been considered in [7, 70], where the constraint $T_1 x = v$ is handled by computing the projection onto the nullspace of the associated linear operator, which has a closed-form expression for some specific choices of $T_1$, such as circulant matrices involved in image restoration.

The solution that we will propose in this work also introduces auxiliary variables. However, our objective is not to deal with linear transformations of the data, but with a projection which does not have a closed-form expression. Consequently, the proposed solution departs from the usual splitting methods, in the sense that our approach leads to a collection of epigraphical and half-space constraint sets, while the usual splitting techniques yield linear constraints.

3 Proposed method

We now turn our attention to convex sets for which the associated projection does not have a closed form, and we show that, under some appropriate assumptions, it is possible to circumvent this difficulty. In Problem 1.1, assume that $C_1$ denotes such a constraint set and that it can be modelled as: for every $y \in \mathbb{R}^{M_1}$,

\[y \in C_1 \;\Leftrightarrow\; h_1(y) = \sum_{\ell=1}^{L_1} h_1^{(\ell)}\big(y^{(\ell)}\big) \le \eta_1, \tag{8}\]

where $\eta_1 \in \mathbb{R}$. Hereabove, the generic vector $y \in \mathbb{R}^{M_1}$ has been decomposed into $L_1$ blocks of coordinates as follows

\[y = \Big[\underbrace{(y^{(1)})^\top}_{\text{size } M_1^{(1)}}, \ldots, \underbrace{(y^{(L_1)})^\top}_{\text{size } M_1^{(L_1)}}\Big]^\top \tag{9}\]

and, for every $\ell \in \{1,\ldots,L_1\}$, $h_1^{(\ell)}$ is a function in $\Gamma_0(\mathbb{R}^{M_1^{(\ell)}})$, the block sizes being such that $M_1^{(1)} + \cdots + M_1^{(L_1)} = M_1$.

3.1 Epigraphical splitting

The idea underlying our approach consists of introducing an auxiliary vector $\zeta_1 = (\zeta_1^{(\ell)})_{1 \le \ell \le L_1} \in \mathbb{R}^{L_1}$, so that Constraint (8) can be equivalently rewritten as follows (note that the inequality in (10) can also be replaced by an equality, even though this makes little difference in our approach):

\[\sum_{\ell=1}^{L_1} \zeta_1^{(\ell)} \le \eta_1, \tag{10}\]
\[(\forall \ell \in \{1,\ldots,L_1\})\quad h_1^{(\ell)}\big(y^{(\ell)}\big) \le \zeta_1^{(\ell)}. \tag{11}\]

Let us now introduce the closed half-space of $\mathbb{R}^{L_1}$ defined as

\[V_1 = \big\{\zeta \in \mathbb{R}^{L_1} \;\big|\; \mathbf{1}_{L_1}^\top \zeta \le \eta_1\big\}, \tag{12}\]

with $\mathbf{1}_{L_1} = (1,\ldots,1)^\top \in \mathbb{R}^{L_1}$, and the closed convex set

\[E_1 = \big\{(y,\zeta) \in \mathbb{R}^{M_1}\times\mathbb{R}^{L_1} \;\big|\; (\forall \ell \in \{1,\ldots,L_1\})\;\; h_1^{(\ell)}\big(y^{(\ell)}\big) \le \zeta^{(\ell)}\big\}. \tag{13}\]

Then, Constraint (10) means that $\zeta_1 \in V_1$, while Constraint (11) is equivalent to $(y,\zeta_1) \in E_1$. In other words, the constraint $H_1 x \in C_1$ can be split into the two constraints $\zeta_1 \in V_1$ and $(H_1 x,\zeta_1) \in E_1$, provided that the additional vector $\zeta_1 \in \mathbb{R}^{L_1}$ is introduced in Problem 1.1. The resulting criterion takes the form:

Problem 3.1.
\[\underset{(x,\zeta_1)\in\mathcal{H}\times V_1}{\text{minimize}}\;\sum_{r=1}^{R} g_r(T_r x)
\quad\text{s.t.}\quad
\begin{cases}
(H_1 x, \zeta_1) \in E_1,\\
H_2 x \in C_2,\\
\;\;\vdots\\
H_S x \in C_S.
\end{cases}
\tag{14}\]

Note that the additional constraints can be easily handled by proximal algorithms as far as the projections onto the associated constraint sets can be computed. In the present case, the projection onto $V_1$ is well-known [36], whereas the projection onto $E_1$ is given by

\[(\forall (y,\zeta) \in \mathbb{R}^{M_1}\times\mathbb{R}^{L_1})\quad P_{E_1}(y,\zeta) = (p,\theta), \tag{15}\]

where $\theta = (\theta^{(\ell)})_{1 \le \ell \le L_1}$, the vector $p$ is blockwise decomposed like $y$ in (9), and

\[(\forall \ell \in \{1,\ldots,L_1\})\quad \big(p^{(\ell)}, \theta^{(\ell)}\big) = P_{\operatorname{epi} h_1^{(\ell)}}\big(y^{(\ell)}, \zeta^{(\ell)}\big). \tag{16}\]

Hence, the problem reduces to the lower-dimensional problem of determining, for each $\ell \in \{1,\ldots,L_1\}$, the projection onto $\operatorname{epi} h_1^{(\ell)}$, a convex subset of $\mathbb{R}^{M_1^{(\ell)}}\times\mathbb{R}$. An example of an algorithm that converges to a solution to Problem 3.1 will be provided in Section 4.
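The half-space projection onto $V_1$ in (12) has the classical closed form $P_{V_1}(\zeta) = \zeta - \max\{\mathbf{1}^\top\zeta - \eta_1, 0\}/L_1 \cdot \mathbf{1}$, which the following sketch (ours) implements; the projection onto $E_1$ then factorizes across blocks as in (16).

```python
import numpy as np

# Closed-form projection onto the half-space {zeta : sum(zeta) <= eta},
# i.e. the set V_1 of (12) in the unweighted case. A minimal sketch.

def proj_sum_halfspace(z, eta):
    """Project z in R^L onto {zeta : sum(zeta) <= eta}."""
    excess = z.sum() - eta
    if excess <= 0:
        return z.copy()          # already feasible
    return z - excess / z.size   # shift all coordinates equally

z = np.array([2.0, 3.0, 1.0])
p = proj_sum_halfspace(z, eta=3.0)   # -> [1.0, 2.0, 0.0], sum(p) == 3
```

This is the inexpensive counterpart to the blockwise epigraphical projections: both operations have a per-iteration cost linear in the number of variables.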

3.2 Proximity operators: new closed forms

The key point in the proposed splitting is the introduction of some epigraphs in the minimization process. In the following, we provide some results concerning the projection onto the epigraph of a convex function.

Proposition 3.2.

Let $\mathcal{H}$ be a real Hilbert space and let $\mathcal{H}\times\mathbb{R}$ be equipped with the standard product space norm. Let $\varphi$ be a function in $\Gamma_0(\mathcal{H})$ such that $\operatorname{dom}\varphi$ is open. The projector onto the epigraph of $\varphi$ is given by:

\[(\forall (y,\zeta) \in \mathcal{H}\times\mathbb{R})\quad P_{\operatorname{epi}\varphi}(y,\zeta) = (p,\theta), \tag{17}\]

where

\[\begin{cases} p = \operatorname{prox}_{\frac12(\max\{\varphi-\zeta,0\})^2}(y),\\ \theta = \max\{\varphi(p),\zeta\}. \end{cases} \tag{18}\]
Proof.

If $\varphi(y) \le \zeta$, then $P_{\operatorname{epi}\varphi}(y,\zeta) = (y,\zeta)$, i.e. $p = y$ and $\theta = \zeta = \max\{\varphi(p),\zeta\}$. In addition,

\[(\forall u \in \mathcal{H})\quad 0 = \frac12\|p-y\|^2 + \frac12\big(\max\{\varphi(p)-\zeta,0\}\big)^2 \le \frac12\|u-y\|^2 + \frac12\big(\max\{\varphi(u)-\zeta,0\}\big)^2, \tag{19}\]

which shows that (18) holds. Let us now consider the case when $\varphi(y) > \zeta$. From the definition of the projection, we get

\[(p,\theta) = \underset{(u,\xi)\in\operatorname{epi}\varphi}{\operatorname{argmin}}\;\|u-y\|^2 + (\xi-\zeta)^2. \tag{20}\]

From the Karush-Kuhn-Tucker theorem [71, Theorem 5.2], the required qualification condition being satisfied, there exists $\alpha \in [0,+\infty[$ such that

\[(p,\theta) = \underset{(u,\xi)\in\mathcal{H}\times\mathbb{R}}{\operatorname{argmin}}\;\frac12\|u-y\|^2 + \frac12(\xi-\zeta)^2 + \alpha\big(\varphi(u)-\xi\big), \tag{21}\]

where the Lagrange multiplier $\alpha$ is such that $\alpha\big(\varphi(p)-\theta\big) = 0$. Since the value $\alpha = 0$ is not allowable (since it would lead to $p = y$ and $\theta = \zeta$, in contradiction with $\varphi(y) > \zeta$), it can be deduced from the above equality that $\varphi(p) = \theta$. In addition, differentiating the Lagrange functional in (21) w.r.t. $\xi$ yields

\[\varphi(p) = \theta = \zeta + \alpha \ge \zeta. \tag{22}\]

Hence, $(p,\theta)$ given by (20) is such that

\[p = \underset{\substack{u\in\mathcal{H}\\ \varphi(u)\ge\zeta}}{\operatorname{argmin}}\;\|u-y\|^2 + \big(\varphi(u)-\zeta\big)^2, \tag{23}\]
\[\theta = \varphi(p) = \max\{\varphi(p),\zeta\}. \tag{24}\]

Furthermore, as $\varphi(y) > \zeta$, we have

\[\inf_{\substack{u\in\mathcal{H}\\ \varphi(u)\le\zeta}}\|u-y\|^2 = \big\|P_{\operatorname{lev}_{\le\zeta}\varphi}(y)-y\big\|^2 = \inf_{\substack{u\in\mathcal{H}\\ \varphi(u)=\zeta}}\|u-y\|^2, \tag{25}\]

where we have used the fact that $P_{\operatorname{lev}_{\le\zeta}\varphi}(y)$ belongs to the boundary of $\operatorname{lev}_{\le\zeta}\varphi$, which is equal to $\{u \in \mathcal{H} \mid \varphi(u) = \zeta\}$ since $\varphi$ is lower-semicontinuous and $\operatorname{dom}\varphi$ is open [44, Corollary 8.38]. We have then

\[\inf_{\substack{u\in\mathcal{H}\\ \varphi(u)\le\zeta}}\|u-y\|^2 = \inf_{\substack{u\in\mathcal{H}\\ \varphi(u)=\zeta}}\|u-y\|^2 \ge \inf_{\substack{u\in\mathcal{H}\\ \varphi(u)\ge\zeta}}\|u-y\|^2 + \big(\varphi(u)-\zeta\big)^2. \tag{26}\]

Altogether, (23) and (26) lead to

\[p = \underset{u\in\mathcal{H}}{\operatorname{argmin}}\;\frac12\|u-y\|^2 + \frac12\big(\max\{\varphi(u)-\zeta,0\}\big)^2, \tag{27}\]

which is equivalent to (18). ∎

Note that alternative characterizations of the epigraphical projection can be found in [44, Propositions 9.17, 28.28].
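As an illustration of Proposition 3.2, consider the scalar function $\varphi(y) = \tau|y|$. Combining (18) with the corresponding proximity operator yields the short closed form below; this unified expression is our own derivation for the sketch and should be read as such, not as a formula from the paper.

```python
import numpy as np

# Epigraphical projection of (y, zeta) onto epi(tau*|.|) in R x R,
# via (18): p = prox_{0.5*max(tau|.|-zeta,0)^2}(y), theta = max(tau|p|, zeta).
# Our derivation, consistent with Proposition 3.2; names are illustrative.

def proj_epi_abs(y, zeta, tau=1.0):
    if tau * abs(y) <= zeta:            # (y, zeta) already in the epigraph
        return y, zeta
    p = np.sign(y) * max(abs(y) + tau * zeta, 0.0) / (1.0 + tau**2)
    return p, max(tau * abs(p), zeta)   # theta = max(phi(p), zeta), cf. (18)

y, zeta = 2.0, 0.0
p, theta = proj_epi_abs(y, zeta)        # projects (2, 0) onto {|u| <= xi}
```

Geometrically, for $\tau = 1$ the point $(2,0)$ is projected onto the boundary line $\xi = u$ of the epigraph, giving $(1,1)$.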

From the previous proposition, we see that the proximity operator appearing in (18) plays a prominent role in the calculation of the projection onto $\operatorname{epi}\varphi$. We now provide an example of function $\varphi$ for which this proximity operator admits a simple form.

Proposition 3.3.

Let $\tau \in ]0,+\infty[$, $\beta \in [1,+\infty[$ and $\zeta \in \mathbb{R}$. Assume that

\[(\forall y \in \mathbb{R})\quad \varphi(y) = \tau|y|^{\beta}. \tag{28}\]

If $\zeta < 0$, then, for every $y \in \mathbb{R}$,

\[\operatorname{prox}_{\frac12(\max\{\tau|\cdot|^{\beta}-\zeta,0\})^2}(y) = \begin{cases} \dfrac{\operatorname{sign}(y)}{1+\tau^2}\,\max\big\{|y|+\tau\zeta,0\big\}, & \text{if } \beta = 1,\\ \operatorname{sign}(y)\,\chi_0, & \text{if } \beta > 1, \end{cases} \tag{29}\]

where $\chi_0$ is the unique solution on $[0,+\infty[$ of the equation

\[\beta\tau^2\chi^{2\beta-1} - \beta\tau\zeta\chi^{\beta-1} + \chi = |y|. \tag{30}\]

If $\zeta \ge 0$, then, for every $y \in \mathbb{R}$,

\[\operatorname{prox}_{\frac12(\max\{\tau|\cdot|^{\beta}-\zeta,0\})^2}(y) = \begin{cases} y, & \text{if } \tau|y|^{\beta} \le \zeta,\\ \operatorname{sign}(y)\,\chi_{\zeta}, & \text{otherwise}, \end{cases} \tag{31}\]

where $\chi_{\zeta}$ is the unique solution on $\big[(\zeta/\tau)^{1/\beta},+\infty\big[$ of (30).

Proof.

Since $\varphi$ is an even function, $\operatorname{prox}_{\frac12(\max\{\varphi-\zeta,0\})^2}$ is an odd function [11, Remark 4.1(ii)]. In the following, we thus focus on the case when $y \ge 0$. Assume first that $\zeta < 0$; then $\max\{\varphi-\zeta,0\} = \varphi-\zeta$. When $\beta = 1$, from [11, Example 4.6], it can be deduced that

\[\operatorname{prox}_{\frac12(\max\{\varphi-\zeta,0\})^2}(y) = \frac{1}{1+\tau^2}\max\{y+\tau\zeta,0\}. \tag{32}\]

When $\beta > 1$, $\frac12(\varphi-\zeta)^2$ is differentiable and, according to (5), $p = \operatorname{prox}_{\frac12(\varphi-\zeta)^2}(y)$ is uniquely defined as the solution of

\[p - y + \beta\tau p^{\beta-1}\big(\tau p^{\beta}-\zeta\big) = 0, \tag{33}\]

where, according to [72, Corollary 2.5], $p \ge 0$. This allows us to deduce that $p = \chi_0$.

Let us now focus on the case when $\zeta \ge 0$. If $\tau y^{\beta} \le \zeta$, it can be deduced from [72, Corollary 2.5] that $\tau p^{\beta} \le \zeta$. Since the function $\frac12(\max\{\varphi-\zeta,0\})^2$ then vanishes in a neighborhood of $p$, (5) yields $p = y$. On the other hand, if $\tau y^{\beta} > \zeta$, as the proximity operator of a function from $\mathbb{R}$ to $\mathbb{R}$ is continuous and increasing [72, Proposition 2.4], $\tau p^{\beta} \ge \zeta$. Since $\frac12(\max\{\varphi-\zeta,0\})^2$ is differentiable in this case, (5) allows us to deduce that $p$ is the unique value in $\big[(\zeta/\tau)^{1/\beta},+\infty\big[$ satisfying (33). It can be concluded that, when $\zeta \ge 0$, (31) holds. ∎

Note that, when $\beta$ is a rational number, (30) is equivalent to a polynomial equation for which either closed form solutions are known or standard numerical methods exist.
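For instance, for $\beta = 2$, equation (30) becomes the cubic $2\tau^2\chi^3 + (1 - 2\tau\zeta)\chi = |y|$, which a standard polynomial solver handles directly. The sketch below (ours) uses `numpy.roots` and targets the $\zeta < 0$, $\beta > 1$ regime of (29), where the real nonnegative root is unique:

```python
import numpy as np

# Solve (30) for beta = 2: 2*tau^2*chi^3 + (1 - 2*tau*zeta)*chi - |y| = 0.
# For zeta < 0 the cubic is strictly increasing, so its real root is unique;
# the prox in (29) is then sign(y) * chi. A numerical sketch, ours.

def solve_eq30_beta2(y, zeta, tau):
    coeffs = [2.0 * tau**2, 0.0, 1.0 - 2.0 * tau * zeta, -abs(y)]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    chi = real[real >= -1e-12].min()    # the admissible nonnegative root
    return float(max(chi, 0.0))

chi = solve_eq30_beta2(y=3.0, zeta=-1.0, tau=1.0)
```

For higher rational $\beta$ the same route applies after clearing denominators in the exponents.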

3.3 Examples of epigraphical projections

The previous propositions allow us to establish the following results concerning the epigraphical projection in (16) (we drop the subscript 1 from $h_1^{(\ell)}$, $y^{(\ell)}$ and $\zeta_1^{(\ell)}$ in order to lighten the notation):

• Distance functions of the form

\[(\forall y^{(\ell)} \in \mathbb{R}^{M^{(\ell)}})\quad h^{(\ell)}\big(y^{(\ell)}\big) = \tau^{(\ell)}\, d_{C^{(\ell)}}^{\beta^{(\ell)}}\big(y^{(\ell)}\big), \tag{34}\]

where $\tau^{(\ell)} \in ]0,+\infty[$, $\beta^{(\ell)} \in [1,+\infty[$, and $C^{(\ell)}$ is a nonempty closed convex subset of $\mathbb{R}^{M^{(\ell)}}$.

Proposition 3.4.

Assume that $h^{(\ell)}$ is given by (34). Then, for every $(y^{(\ell)},\zeta^{(\ell)}) \in \mathbb{R}^{M^{(\ell)}}\times\mathbb{R}$, $P_{\operatorname{epi} h^{(\ell)}}\big(y^{(\ell)},\zeta^{(\ell)}\big) = \big(p^{(\ell)},\theta^{(\ell)}\big)$, where

\[p^{(\ell)} = \begin{cases} y^{(\ell)}, & \text{if } y^{(\ell)} \in C^{(\ell)},\\ \alpha^{(\ell)} y^{(\ell)} + \big(1-\alpha^{(\ell)}\big) P_{C^{(\ell)}}\big(y^{(\ell)}\big), & \text{otherwise}, \end{cases} \tag{35}\]

and $\theta^{(\ell)} = \max\big\{h^{(\ell)}\big(p^{(\ell)}\big), \zeta^{(\ell)}\big\}$, where

\[\alpha^{(\ell)} = \frac{\operatorname{prox}_{\frac12(\max\{\tau^{(\ell)}|\cdot|^{\beta^{(\ell)}}-\zeta^{(\ell)},0\})^2}\big(d_{C^{(\ell)}}(y^{(\ell)})\big)}{d_{C^{(\ell)}}\big(y^{(\ell)}\big)}, \tag{36}\]

and the expression of the involved proximity operator is provided by Proposition 3.3.

Proof.

Let us notice that $\frac12\big(\max\{h^{(\ell)}-\zeta^{(\ell)},0\}\big)^2 = \psi^{(\ell)} \circ d_{C^{(\ell)}}$, where $\psi^{(\ell)} = \frac12\big(\max\{\tau^{(\ell)}|\cdot|^{\beta^{(\ell)}}-\zeta^{(\ell)},0\}\big)^2$. According to [13, Proposition 2.7], for every $y^{(\ell)} \in \mathbb{R}^{M^{(\ell)}}$,

\[\operatorname{prox}_{\psi^{(\ell)}\circ d_{C^{(\ell)}}}\big(y^{(\ell)}\big) = \begin{cases} y^{(\ell)}, & \text{if } y^{(\ell)} \in C^{(\ell)},\\ P_{C^{(\ell)}}\big(y^{(\ell)}\big), & \text{if } d_{C^{(\ell)}}\big(y^{(\ell)}\big) \le \max \partial\psi^{(\ell)}(0),\\ \alpha^{(\ell)} y^{(\ell)} + \big(1-\alpha^{(\ell)}\big) P_{C^{(\ell)}}\big(y^{(\ell)}\big), & \text{if } d_{C^{(\ell)}}\big(y^{(\ell)}\big) > \max \partial\psi^{(\ell)}(0), \end{cases} \tag{37}\]

where $\alpha^{(\ell)}$ is given by (36). In addition, we have

\[\partial\psi^{(\ell)}(0) = \begin{cases} \big[\tau^{(\ell)}\zeta^{(\ell)}, -\tau^{(\ell)}\zeta^{(\ell)}\big], & \text{if } \zeta^{(\ell)} < 0 \text{ and } \beta^{(\ell)} = 1,\\ \{0\}, & \text{otherwise}, \end{cases} \tag{38}\]

and, according to Proposition 3.3, when $\zeta^{(\ell)} < 0$ and $\beta^{(\ell)} = 1$, $\operatorname{prox}_{\psi^{(\ell)}}$ vanishes on $\big[0, -\tau^{(\ell)}\zeta^{(\ell)}\big]$, so that $\alpha^{(\ell)} = 0$ in the second case of (37). This shows that (37) reduces to (35). ∎
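Proposition 3.4 can be put to work with very little code. The sketch below (ours) takes $\beta^{(\ell)} = 1$ and $C^{(\ell)}$ the Euclidean unit ball, and combines (35)-(36) with the $\beta = 1$ proximity operator of Proposition 3.3:

```python
import numpy as np

# Epigraphical projection for h(y) = tau * d_C(y) with C the unit ball,
# via (35)-(36) and the beta = 1 prox of Proposition 3.3.
# Names are illustrative; this is a sketch, not the paper's code.

def proj_unit_ball(y):
    n = np.linalg.norm(y)
    return y if n <= 1.0 else y / n

def prox_psi_beta1(d, tau, zeta):
    """prox of 0.5*max(tau*|.| - zeta, 0)^2 at d >= 0, cf. (29)/(31)."""
    if tau * d <= zeta:
        return d
    return max(d + tau * zeta, 0.0) / (1.0 + tau**2)

def proj_epi_dist(y, zeta, tau=1.0):
    pc = proj_unit_ball(y)
    d = np.linalg.norm(y - pc)
    if d == 0.0:                               # y already in C
        p = np.array(y, dtype=float)
    else:
        alpha = prox_psi_beta1(d, tau, zeta) / d
        p = alpha * y + (1.0 - alpha) * pc     # (35)
    h = tau * np.linalg.norm(p - proj_unit_ball(p))
    return p, max(h, zeta)                     # theta = max(h(p), zeta)

y = np.array([3.0, 0.0])                       # d_C(y) = 2
p, theta = proj_epi_dist(y, zeta=0.0)
```

For this input the projected point slides along the segment joining $y$ to $P_C(y)$, landing at $(2,0)$ with $\theta = 1$.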

• Euclidean norms, as a particular case of the distance functions in (34) when $\beta^{(\ell)} = 1$ and $C^{(\ell)} = \{0\}$:

\[(\forall y^{(\ell)} \in \mathbb{R}^{M^{(\ell)}})\quad h^{(\ell)}\big(y^{(\ell)}\big) = \tau^{(\ell)}\big\|y^{(\ell)}\big\|, \tag{39}\]

where $\tau^{(\ell)} \in ]0,+\infty[$ and $\|\cdot\|$ denotes the Euclidean norm of $\mathbb{R}^{M^{(\ell)}}$. The resulting epigraphical projection is given below.

Corollary 3.5.

Assume that $h^{(\ell)}$ is given by (39).
Then, for every $(y^{(\ell)},\zeta^{(\ell)}) \in \mathbb{R}^{M^{(\ell)}}\times\mathbb{R}$,

\[P_{\operatorname{epi} h^{(\ell)}}\big(y^{(\ell)},\zeta^{(\ell)}\big) = \begin{cases} \big(y^{(\ell)},\zeta^{(\ell)}\big), & \text{if } \tau^{(\ell)}\|y^{(\ell)}\| \le \zeta^{(\ell)},\\ (0,0), & \text{if } \|y^{(\ell)}\| \le -\tau^{(\ell)}\zeta^{(\ell)},\\ \alpha^{(\ell)}\big(y^{(\ell)},\, \tau^{(\ell)}\|y^{(\ell)}\|\big), & \text{otherwise}, \end{cases} \tag{40}\]

where $\alpha^{(\ell)} = \dfrac{1}{1+(\tau^{(\ell)})^2}\bigg(1+\dfrac{\tau^{(\ell)}\zeta^{(\ell)}}{\|y^{(\ell)}\|}\bigg)$.

The epigraph of the Euclidean norm is the so-called Lorentz convex symmetric cone [73, 74] and the above result is actually known in the literature [75]. As will be shown in Section 4, this expression of the epigraphical projection is useful to deal with multivariate sparsity constraints [76] or total variation bounds [27, 77], since such constraints typically involve a sum of functions like (39) composed with linear operators corresponding to analysis transforms or gradient operators.
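The Lorentz-cone projection of Corollary 3.5 is simple enough to implement directly; the following sketch (our code, derived from the classical second-order cone projection formula with a scaling $\tau$) covers the three regimes: point already in the epigraph, point projecting onto the apex, and the generic boundary case.

```python
import numpy as np

# Projection onto epi(tau*||.||), i.e. the scaled second-order (Lorentz)
# cone {(y, xi) : tau*||y|| <= xi}. A sketch, with our variable names.

def proj_epi_norm(y, zeta, tau=1.0):
    n = np.linalg.norm(y)
    if tau * n <= zeta:                 # already in the epigraph
        return y.copy(), zeta
    if n <= -tau * zeta:                # projects onto the apex (0, 0)
        return np.zeros_like(y), 0.0
    alpha = (1.0 + tau * zeta / n) / (1.0 + tau**2)
    p = alpha * y                       # shrink radially toward the axis
    return p, tau * np.linalg.norm(p)   # land on the cone boundary

y = np.array([3.0, 4.0])                # ||y|| = 5
p, theta = proj_epi_norm(y, zeta=0.0)   # tau = 1
```

For $\tau = 1$ and $\zeta = 0$ this recovers the textbook cone projection: $(3,4,0)$ maps to $(1.5, 2, 2.5)$.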

• Infinity norms defined as: for every $y^{(\ell)} = \big(y^{(\ell,m)}\big)_{1 \le m \le M^{(\ell)}} \in \mathbb{R}^{M^{(\ell)}}$,

\[h^{(\ell)}\big(y^{(\ell)}\big) = \max\bigg\{\frac{|y^{(\ell,m)}|}{\tau^{(\ell,m)}} \;\bigg|\; 1 \le m \le M^{(\ell)}\bigg\}, \tag{41}\]

where $\big(\tau^{(\ell,m)}\big)_{1 \le m \le M^{(\ell)}} \in\; ]0,+\infty[^{M^{(\ell)}}$.

Proposition 3.6.

Assume that $h^{(\ell)}$ is given by (41), where the values $\nu^{(\ell,m)} = |y^{(\ell,m)}|/\tau^{(\ell,m)}$ are indexed in ascending order, and set $\nu^{(\ell,0)} = 0$ and $\nu^{(\ell,M^{(\ell)}+1)} = +\infty$. Then, for every $(y^{(\ell)},\zeta^{(\ell)}) \in \mathbb{R}^{M^{(\ell)}}\times\mathbb{R}$, $P_{\operatorname{epi} h^{(\ell)}}\big(y^{(\ell)},\zeta^{(\ell)}\big) = \big(p^{(\ell)},\theta^{(\ell)}\big)$ is such that $p^{(\ell)} = \big(p^{(\ell,m)}\big)_{1 \le m \le M^{(\ell)}}$, with

\[p^{(\ell,m)} = \begin{cases} y^{(\ell,m)}, & \text{if } |y^{(\ell,m)}| \le \tau^{(\ell,m)}\theta^{(\ell)},\\ \tau^{(\ell,m)}\theta^{(\ell)}, & \text{if } y^{(\ell,m)} > \tau^{(\ell,m)}\theta^{(\ell)},\\ -\tau^{(\ell,m)}\theta^{(\ell)}, & \text{if } y^{(\ell,m)} < -\tau^{(\ell,m)}\theta^{(\ell)}, \end{cases} \tag{42}\]
\[\theta^{(\ell)} = \frac{\max\Big(\zeta^{(\ell)} + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}} \nu^{(\ell,m)}\big(\tau^{(\ell,m)}\big)^2,\, 0\Big)}{1 + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}} \big(\tau^{(\ell,m)}\big)^2}, \tag{43}\]

and $\overline{m}^{(\ell)}$ is the unique integer in $\{1,\ldots,M^{(\ell)}+1\}$ such that

\[\nu^{(\ell,\overline{m}^{(\ell)}-1)} < \frac{\zeta^{(\ell)} + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}} \nu^{(\ell,m)}\big(\tau^{(\ell,m)}\big)^2}{1 + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}} \big(\tau^{(\ell,m)}\big)^2} \le \nu^{(\ell,\overline{m}^{(\ell)})}. \tag{44}\]
Proof.

For every $(y^{(\ell)},\zeta^{(\ell)}) \in \mathbb{R}^{M^{(\ell)}}\times\mathbb{R}$, in order to determine $P_{\operatorname{epi} h^{(\ell)}}\big(y^{(\ell)},\zeta^{(\ell)}\big)$, we have to find

\[\min_{\theta^{(\ell)}\in[0,+\infty[}\Bigg(\big(\theta^{(\ell)}-\zeta^{(\ell)}\big)^2 + \min_{\substack{|p^{(\ell,1)}| \le \tau^{(\ell,1)}\theta^{(\ell)}\\ \vdots\\ |p^{(\ell,M^{(\ell)})}| \le \tau^{(\ell,M^{(\ell)})}\theta^{(\ell)}}} \big\|p^{(\ell)}-y^{(\ell)}\big\|^2\Bigg). \tag{45}\]

For every $\theta^{(\ell)} \in [0,+\infty[$, the inner minimization is achieved when, for every $m \in \{1,\ldots,M^{(\ell)}\}$, $p^{(\ell,m)}$ is the projection of $y^{(\ell,m)}$ onto $\big[-\tau^{(\ell,m)}\theta^{(\ell)}, \tau^{(\ell,m)}\theta^{(\ell)}\big]$, which is given by (42). Then, the problem reduces to

\[\underset{\theta^{(\ell)}\in[0,+\infty[}{\text{minimize}}\;\Bigg(\big(\theta^{(\ell)}-\zeta^{(\ell)}\big)^2 + \sum_{m=1}^{M^{(\ell)}}\Big(\max\big\{|y^{(\ell,m)}| - \tau^{(\ell,m)}\theta^{(\ell)},\, 0\big\}\Big)^2\Bigg), \tag{46}\]

which is also equivalent to calculating $\operatorname{prox}_{\phi^{(\ell)}+\iota_{[0,+\infty[}}\big(\zeta^{(\ell)}\big)$, where $\phi^{(\ell)}$ is such that, for every $v \in \mathbb{R}$,

\[\phi^{(\ell)}(v) = \frac12\sum_{m=1}^{M^{(\ell)}}\Big(\max\big\{\tau^{(\ell,m)}\big(\nu^{(\ell,m)}-v\big),\, 0\big\}\Big)^2. \tag{47}\]

By using [12, Proposition 12], we have $\theta^{(\ell)} = \max\big\{\operatorname{prox}_{\phi^{(\ell)}}\big(\zeta^{(\ell)}\big), 0\big\}$. The function $\phi^{(\ell)}$ belongs to $\Gamma_0(\mathbb{R})$ since, for every $m \in \{1,\ldots,M^{(\ell)}\}$, $v \mapsto \tau^{(\ell,m)}\big(\nu^{(\ell,m)}-v\big)$ is finite convex and $t \mapsto (\max\{t,0\})^2$ is finite convex and increasing on $[0,+\infty[$. In addition, $\phi^{(\ell)}$ is differentiable and it is such that, for every $k \in \{1,\ldots,M^{(\ell)}+1\}$ and every $v \in \big[\nu^{(\ell,k-1)}, \nu^{(\ell,k)}\big]$,

\[\phi^{(\ell)\,\prime}(v) = -\sum_{m=k}^{M^{(\ell)}}\big(\tau^{(\ell,m)}\big)^2\big(\nu^{(\ell,m)}-v\big). \tag{48}\]

For every $\zeta^{(\ell)} \in \mathbb{R}$, as $\chi^{(\ell)} = \operatorname{prox}_{\phi^{(\ell)}}\big(\zeta^{(\ell)}\big)$ is characterized by (5), there exists $\overline{m}^{(\ell)} \in \{1,\ldots,M^{(\ell)}+1\}$ such that $\chi^{(\ell)} \in \big[\nu^{(\ell,\overline{m}^{(\ell)}-1)}, \nu^{(\ell,\overline{m}^{(\ell)})}\big]$ and

\[\zeta^{(\ell)} - \chi^{(\ell)} = \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}}\big(\tau^{(\ell,m)}\big)^2\big(\chi^{(\ell)} - \nu^{(\ell,m)}\big). \tag{49}\]

This yields $\chi^{(\ell)} = \Big(\zeta^{(\ell)} + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}} \nu^{(\ell,m)}\big(\tau^{(\ell,m)}\big)^2\Big)\Big/\Big(1 + \sum_{m=\overline{m}^{(\ell)}}^{M^{(\ell)}}\big(\tau^{(\ell,m)}\big)^2\Big)$, hence (43), and the condition $\chi^{(\ell)} \in \big[\nu^{(\ell,\overline{m}^{(\ell)}-1)}, \nu^{(\ell,\overline{m}^{(\ell)})}\big]$ yields (44). The uniqueness of $\overline{m}^{(\ell)}$ satisfying this inequality follows from the uniqueness of $\chi^{(\ell)}$. ∎

When $\tau^{(\ell,1)} = \cdots = \tau^{(\ell,M^{(\ell)})} = 1$, the function in (41) reduces to the standard infinity norm, for which the expression of the epigraphical projection has been recently given in [Ding2012_j_math_matrix_cone]. Note that this proposition can be employed to efficiently deal with $\ell_\infty$-based regularization, which has attracted much interest recently [78, 30, 79].
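Proposition 3.6 translates into a short scan over the sorted breakpoints. The sketch below (ours, not the paper's implementation) finds $\theta^{(\ell)}$ from (43)-(44) and then clips each component as in (42):

```python
import numpy as np

# Epigraphical projection for the weighted sup-norm h(y) = max_m |y_m|/tau_m.
# theta is obtained by scanning the sorted breakpoints nu_m = |y_m|/tau_m,
# cf. (43)-(44); p clips y to [-tau_m*theta, tau_m*theta], cf. (42).

def proj_epi_linf(y, zeta, tau):
    nu = np.abs(y) / tau
    order = np.argsort(nu)
    nu_s, tau_s = nu[order], tau[order]
    M = y.size
    theta = 0.0                       # fallback when every candidate is <= 0
    for mbar in range(1, M + 2):      # mbar as in (44); tail = indices >= mbar
        tail = slice(mbar - 1, M)
        denom = 1.0 + (tau_s[tail] ** 2).sum()
        ratio = (zeta + (nu_s[tail] * tau_s[tail] ** 2).sum()) / denom
        lo = 0.0 if mbar == 1 else nu_s[mbar - 2]
        hi = np.inf if mbar == M + 1 else nu_s[mbar - 1]
        if lo < ratio <= hi:          # condition (44)
            theta = max(ratio, 0.0)
            break
    p = np.clip(y, -tau * theta, tau * theta)
    return p, theta

y, tau = np.array([3.0, 1.0]), np.array([1.0, 1.0])
p, theta = proj_epi_linf(y, zeta=0.0, tau=tau)
```

With a pre-sort, the scan costs $O(M^{(\ell)}\log M^{(\ell)})$ per block, which is what makes these projections affordable inside an iterative proximal scheme.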

4 Experimental Results

In this section, we provide numerical examples to illustrate the usefulness of the proposed epigraphical projection method. The first experiment focuses on applications in image restoration involving projections onto constraint sets of the form (8). The second experiment deals with a pulse shape design problem based on Proposition 3.4.