# An enhanced Baillon-Haddad theorem for convex functions on convex sets

Pedro Pérez-Aros and Emilio Vilches

Universidad de O’Higgins, Instituto de Ciencias de la Ingeniería, Av. Libertador Bernardo O’Higgins 611, Rancagua, 2841959, Chile

Universidad de O’Higgins, Instituto de Ciencias de la Educación and Instituto de Ciencias de la Ingeniería, Av. Libertador Bernardo O’Higgins 611, Rancagua, 2841959, Chile

###### Abstract.

In this paper, we prove the Baillon-Haddad theorem for Gâteaux differentiable convex functions defined on open convex sets of arbitrary Hilbert spaces. Formally, this result establishes that the gradient of a convex function defined on an open convex set is $\beta$-Lipschitz if and only if it is $1/\beta$-cocoercive. An application to convex optimization through dynamical systems is given.

## 1. Introduction

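Let $\mathcal{H}$ be a Hilbert space endowed with a scalar product $\langle\cdot,\cdot\rangle$, induced norm $\|\cdot\|$ and unit ball $\mathbb{B}$. Given a nonempty set $C\subset\mathcal{H}$ and $\beta>0$, we say that an operator $T\colon C\to\mathcal{H}$ is $1/\beta$-cocoercive if for all $x,y\in C$

$$\beta\langle Tx-Ty,x-y\rangle\geq\|Tx-Ty\|^{2}, \tag{1}$$

and is $\beta$-Lipschitz continuous if for all $x,y\in C$

$$\|Tx-Ty\|\leq\beta\|x-y\|. \tag{2}$$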

If $\beta=1$, then (1) means that $T$ is firmly nonexpansive and (2) that $T$ is nonexpansive (see, e.g., [5, Chapter 4]). It is clear that (1) implies (2), while the converse, in general, is false (take, for example, $T=-\mathrm{Id}$). Despite this negative result, the Baillon-Haddad theorem ([3, Corollaire 10]) states that if $T$ is the gradient of a convex function, then (1) and (2) are equivalent. The precise statement is the following:

###### Theorem 1.1 (Baillon-Haddad).

Let $f\colon\mathcal{H}\to\mathbb{R}$ be convex, Fréchet differentiable on $\mathcal{H}$, and such that $\nabla f$ is $\beta$-Lipschitz continuous for some $\beta>0$. Then $\nabla f$ is $1/\beta$-cocoercive.
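As an informal numerical illustration of (1) and (2), the following sketch samples random pairs of points for the quadratic $f(x)=\frac{1}{2}x^{\top}Qx$ on $\mathbb{R}^2$ and records the smallest slack in each inequality. The matrix `Q`, the sampling box and the helper names are illustrative assumptions, not taken from the paper; $\beta$ is taken to be the largest eigenvalue of `Q`. The last function evaluates the slack in (1) for the operator $-\mathrm{Id}$ with $\beta=1$, which is nonexpansive but not firmly nonexpansive.

```python
import random

def grad_f(q, x):
    """Gradient of f(x) = 0.5 * x^T Q x for a symmetric 2x2 matrix Q."""
    return (q[0][0] * x[0] + q[0][1] * x[1],
            q[1][0] * x[0] + q[1][1] * x[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def largest_eigenvalue(q):
    """Largest eigenvalue of a symmetric 2x2 matrix (closed form)."""
    mean = 0.5 * (q[0][0] + q[1][1])
    diff = 0.5 * (q[0][0] - q[1][1])
    return mean + (diff * diff + q[0][1] * q[0][1]) ** 0.5

# Illustrative positive semidefinite matrix (trace 3, determinant 1).
Q = ((2.0, 1.0), (1.0, 1.0))
BETA = largest_eigenvalue(Q)  # Lipschitz constant of the gradient x -> Qx

def worst_gaps(trials=10_000, seed=0):
    """Smallest observed slack in (2) and in (1) over random pairs (x, y)."""
    rng = random.Random(seed)
    lipschitz_gap = float("inf")
    cocoercive_gap = float("inf")
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        d = sub(x, y)
        g = sub(grad_f(Q, x), grad_f(Q, y))
        lipschitz_gap = min(lipschitz_gap,
                            BETA * dot(d, d) ** 0.5 - dot(g, g) ** 0.5)
        cocoercive_gap = min(cocoercive_gap, BETA * dot(g, d) - dot(g, g))
    return lipschitz_gap, cocoercive_gap

def cocoercivity_gap_minus_identity(x, y):
    """Slack in (1) with beta = 1 for T = -Id; it equals -2 * ||x - y||^2."""
    d = sub(x, y)
    t = sub((-x[0], -x[1]), (-y[0], -y[1]))
    return dot(t, d) - dot(t, t)
```

Both slacks returned by `worst_gaps()` should be nonnegative up to rounding, in line with Theorem 1.1, whereas `cocoercivity_gap_minus_identity` is negative whenever the two points differ.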

This prominent result provides an important link between convex optimization and fixed-point iteration. Moreover, it has many applications in optimization and numerical functional analysis (see, e.g., [2, 5, 9, 16, 17]). An improved version of Theorem 1.1 appeared in [4] (see also [8, Theorem 1.2]), where the authors relate the Lipschitz continuity of the gradient of a convex function to the convexity and Moreau envelopes of associated functions (see [4, Theorem 2.1]). Furthermore, they provided the following Baillon-Haddad theorem for twice continuously differentiable convex functions defined on open convex sets.

###### Theorem 1.2.

[4, Theorem 3.3] Let $\Omega$ be a nonempty open convex subset of $\mathcal{H}$, let $f\colon\Omega\to\mathbb{R}$ be convex and twice continuously Fréchet differentiable on $\Omega$, and let $\beta>0$. Then $\nabla f$ is $\beta$-Lipschitz continuous if and only if it is $1/\beta$-cocoercive.

Finally, the authors left as an open question the validity of Theorem 1.2 for Gâteaux differentiable convex functions (see [4, Remark 3.5]).

The aim of this paper is to extend Theorem 1.2 to merely Gâteaux differentiable convex functions (see Theorem 3.1). To do that, we first establish the result in finite dimensions and then use a finite dimensional reduction.

We emphasize that extending Theorem 1.2 is of interest because it provides an important link between the gradients of convex functions defined on convex sets and cocoercive operators defined on convex sets.

Cocoercivity arises in various areas of optimization and nonlinear analysis (see, e.g., [1, 5, 6, 11, 14]). In particular, it plays an important role in the design of algorithms to solve structured monotone inclusions (which include fixed-point problems for nonexpansive operators). Indeed, let us consider the structured monotone inclusion: find $x\in\mathcal{H}$ such that

$$0\in\partial\Phi(x)+Bx, \tag{3}$$

where $\Phi$ is a convex lower semicontinuous function and $B$ is a monotone operator. It is well known (see, e.g., ) that problem (3) is equivalent to the fixed point problem: find $x\in\mathcal{H}$ such that

$$x=\operatorname{prox}_{\mu\Phi}(x-\mu Bx), \tag{4}$$

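where $\mu>0$ and $\operatorname{prox}_{\mu\Phi}$ is the proximal mapping of $\mu\Phi$ (see, e.g., [5, Definition 12.23]) defined by

$$\operatorname{prox}_{\mu\Phi}(x):=\operatorname*{argmin}_{y\in\mathcal{H}}\Big\{\Phi(y)+\frac{1}{2\mu}\|x-y\|^{2}\Big\}.$$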

To solve the fixed point problem (4), Abbas and Attouch [1] introduced the following dynamical system

$$\dot{x}(t)+x(t)=\operatorname{prox}_{\mu\Phi}\big(x(t)-\mu Bx(t)\big), \tag{5}$$

whose equilibrium points are solutions of (4). They proved the following result (see [1, Theorem 5.2]).

###### Proposition 1.3.

Let be a convex lower semicontinuous proper function, and a maximal monotone operator which is -cocoercive. Suppose that and

 zer(∂Φ+B):={z∈H:0∈∂Φ(z)+Bz}≠∅.

Then the unique solution of (5) weakly converges to some element of $\operatorname{zer}(\partial\Phi+B)$.
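To make the dynamics (5) concrete, here is a minimal sketch of an explicit Euler discretization in one dimension. Everything specific in it is an illustrative assumption rather than part of the paper: $\Phi=|\cdot|$ (whose proximal mapping is soft-thresholding), $Bx=x-3$ (the gradient of $\frac{1}{2}(x-3)^2$, which is 1-Lipschitz), the parameter $\mu=0.5$, which is simply assumed to be admissible, and the step size. For these choices the inclusion $0\in\partial\Phi(x)+Bx$ has the unique solution $x=2$.

```python
def soft_threshold(v, mu):
    """Proximal mapping of mu * |.| at v (one-dimensional soft-thresholding)."""
    if v > mu:
        return v - mu
    if v < -mu:
        return v + mu
    return 0.0

def euler_trajectory(x0, mu=0.5, step=0.1, n_steps=2000, target=3.0):
    """Explicit Euler scheme for x'(t) + x(t) = prox_{mu*Phi}(x(t) - mu*B x(t)),
    with the illustrative choices Phi = |.| and B x = x - target."""
    x = x0
    for _ in range(n_steps):
        forward = x - mu * (x - target)          # forward step with B
        backward = soft_threshold(forward, mu)   # backward (proximal) step
        x = x + step * (backward - x)            # Euler step on x' = backward - x
    return x
```

Starting from either side of the solution, for instance `euler_trajectory(-5.0)` or `euler_trajectory(10.0)`, the iterates approach $2$, the zero of $\partial\Phi+B$ in this toy example.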

The previous result was extended by Boţ and Csetnek (see [6, Theorem 12]) to solve the monotone inclusion: find $x\in\mathcal{H}$ such that

$$0\in Ax+Bx, \tag{6}$$

where $A$ is a maximal monotone operator and $B$ is cocoercive. These two results were extended by the authors in [14], where we proved the strong convergence of a Tikhonov regularization for the dynamical system (5). It is important to emphasize that in order to solve the problems (4) and (6), it is enough that the operator $B$ is defined in and , respectively. Therefore, it is interesting to have characterizations of cocoercive operators defined on open convex subsets of $\mathcal{H}$. Thus, it is important to extend Theorem 1.2 to merely Gâteaux differentiable functions (see [4, Remark 3.5]).

The paper is organized as follows. After some preliminaries, in Section 3 we state and prove the main result of the paper (Theorem 3.1). Then, in Section 4 we give an application to convex optimization.

## 2. Notation and Preliminaries

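Let $\mathcal{H}$ be a Hilbert space endowed with a scalar product $\langle\cdot,\cdot\rangle$, induced norm $\|\cdot\|$ and unit ball $\mathbb{B}$. We denote by $\mathcal{L}(\mathcal{H},\mathcal{H})$ the set of continuous linear operators from $\mathcal{H}$ into $\mathcal{H}$. The norm of an operator $A\in\mathcal{L}(\mathcal{H},\mathcal{H})$ is defined by

$$\|A\|_{\mathcal{L}(\mathcal{H},\mathcal{H})}:=\sup_{h\in\mathbb{B}}\|Ah\|.$$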

Given an open convex set $\Omega\subset\mathcal{H}$, we denote by $C^{1,1}(\Omega)$ the class of Fréchet differentiable functions whose gradient is locally Lipschitz (see, e.g., [15, Chapter 9]).

Given $\beta>0$ and $T\colon C\to\mathcal{H}$, we say that $T$ is $1/\beta$-cocoercive on $C$ if for all $x,y\in C$

$$\beta\langle Tx-Ty,x-y\rangle\geq\|Tx-Ty\|^{2}.$$

###### Example 1.

The following list provides some examples of cocoercive operators (we refer to [5, Chapter 4] for further properties of cocoercive operators):

1. $T$ is nonexpansive if and only if $\mathrm{Id}-T$ is $1/2$-cocoercive.

2. $T$ is $1/\beta$-cocoercive if and only if $\frac{2}{\beta}T-\mathrm{Id}$ is $1$-Lipschitz.

3. A matrix is psd-plus (that is, for some positive definite) if and only if the mapping is cocoercive (see [17, Proposition 2.5]).

4. The Yosida approximation $A_{\lambda}$ of index $\lambda>0$ of a maximal monotone operator $A$ is $\lambda$-cocoercive (see [5, Corollary 23.11]).
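The link between cocoercivity and nonexpansiveness used later in the paper can be checked by expanding a square. The following computation is a sketch under the convention of this section, namely that $T$ is $1/\beta$-cocoercive when $\beta\langle Tx-Ty,x-y\rangle\geq\|Tx-Ty\|^{2}$; the operator $R:=\frac{2}{\beta}T-\mathrm{Id}$ is notation introduced here only for the computation.

```latex
\begin{aligned}
\|Rx-Ry\|^{2}
  &= \Big\|\tfrac{2}{\beta}(Tx-Ty)-(x-y)\Big\|^{2} \\
  &= \tfrac{4}{\beta^{2}}\|Tx-Ty\|^{2}
     -\tfrac{4}{\beta}\langle Tx-Ty,\,x-y\rangle
     +\|x-y\|^{2}.
\end{aligned}
% Hence \|Rx-Ry\| \le \|x-y\| holds if and only if
% \tfrac{4}{\beta^{2}}\|Tx-Ty\|^{2} \le \tfrac{4}{\beta}\langle Tx-Ty,\,x-y\rangle,
% that is, if and only if \|Tx-Ty\|^{2} \le \beta\langle Tx-Ty,\,x-y\rangle.
```

In other words, $R$ is $1$-Lipschitz exactly when $T$ is $1/\beta$-cocoercive.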

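For a convex function $f\colon\mathcal{H}\to\mathbb{R}\cup\{+\infty\}$ we consider the convex subdifferential of $f$ at $x$ as

$$\partial f(x):=\{x^{*}\in\mathcal{H}: f(x)+\langle x^{*},y-x\rangle\leq f(y)\ \text{ for all } y\in\mathcal{H}\}.$$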

It is well-known that for two functions the following equality holds (see, e.g., [5, Corollary 16.48]):

 (7) ∂(f+g)(x)=∂f(x)+∂g(x).

To prove our main result, we will use finite dimensional reduction arguments, thus, some elements of generalized differentiation in finite dimensions will be needed. We refer to  for more details.

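Let $f\colon\Omega\subset\mathbb{R}^{n}\to\mathbb{R}$ be a $C^{1,1}$ function. For $\bar{x}\in\Omega$, we define the generalized Hessian of $f$ at $\bar{x}$ (see, e.g., [15, Theorem 9.62] and ) as the set of matrices

$$\overline{\nabla}^{2}f(\bar{x}):=\{A\in\mathbb{R}^{n\times n}\mid \exists\, x_{n}\to\bar{x},\ x_{n}\in D,\ \nabla^{2}f(x_{n})\to A\},$$

where $D$ is the dense set of points where $f$ is twice differentiable (by virtue of Rademacher’s theorem the set $D$ exists). The following result (see [15, Theorem 13.52]) establishes some properties of the generalized Hessian $\overline{\nabla}^{2}f$.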

###### Proposition 2.1.

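Let $f\colon\Omega\to\mathbb{R}$ be a $C^{1,1}$ function, where $\Omega\subset\mathbb{R}^{n}$ is an open set. Then $\overline{\nabla}^{2}f(\bar{x})$ is a nonempty, compact set of symmetric matrices for every $\bar{x}\in\Omega$.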

The following result gives a known characterization of convexity and Lipschitzianity of functions (see, e.g., [15, 12]). We give a proof for completeness.

###### Proposition 2.2.

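Let $f\colon\Omega\to\mathbb{R}$ be a $C^{1,1}$ function with $\Omega\subset\mathbb{R}^{n}$ convex. Then

1. $f$ is convex if and only if for all $x\in\Omega$ and all $A\in\overline{\nabla}^{2}f(x)$ one has

$$\langle Au,u\rangle\geq 0\quad\text{for all } u\in\mathbb{R}^{n}.$$

2. $\nabla f$ is $\beta$-Lipschitz on $\Omega$ if and only if for all $x\in\Omega$ and all $A\in\overline{\nabla}^{2}f(x)$ the inequality $\|A\|\leq\beta$ holds.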

###### Proof.

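(i) follows from [12, Example 2.2]. The necessity in (ii) is direct. To prove the sufficiency in (ii), it is enough to assume that for all $x\in\Omega$ and all $A\in\overline{\nabla}^{2}f(x)$ the inequality $\|A\|\leq 1$ holds. Fix $y\in\mathbb{R}^{n}$ and consider the function $g_{y}(x):=\langle\nabla f(x),y\rangle$. Then, $g_{y}$ is locally Lipschitz on $\Omega$ and

$$\nabla g_{y}(x)=\langle\nabla^{2}f(x),y\rangle\quad\text{a.e. } x\in\Omega.$$

Thus,

$$\sup_{w\in\overline{\nabla}g_{y}(x)}\|w\|\leq\|y\|\quad\text{for all } x\in\Omega.$$

Hence, according to [13, Theorem 3.5.2], the map $g_{y}$ is $\|y\|$-Lipschitz on $\Omega$. Finally, by virtue of [15, Exercise 9.9], we conclude that $\nabla f$ is $1$-Lipschitz on $\Omega$. ∎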
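A one-dimensional example may help to see how the generalized Hessian enters these characterizations. The function below is an illustrative choice, not taken from the paper: it is $C^{1,1}$ but not twice differentiable at the origin.

```latex
% Illustrative example on \Omega = \mathbb{R}:
f(x) = \tfrac{1}{2}\max\{x,0\}^{2},
\qquad
f'(x) = \max\{x,0\},
\qquad
f''(x) =
\begin{cases}
0, & x<0,\\
1, & x>0.
\end{cases}
% f is twice differentiable on D = \mathbb{R}\setminus\{0\}, and taking limits of f''
% along sequences in D gives
\overline{\nabla}^{2}f(\bar{x}) =
\begin{cases}
\{0\}, & \bar{x}<0,\\
\{0,1\}, & \bar{x}=0,\\
\{1\}, & \bar{x}>0.
\end{cases}
```

Every element $A$ of these sets satisfies $A\geq 0$ and $|A|\leq 1$, which matches the facts that $f$ is convex and that $f'$ is $1$-Lipschitz on $\mathbb{R}$.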

## 3. An enhanced Baillon-Haddad theorem

In this section, we state and prove the main result of the paper, that is, the Baillon-Haddad theorem for convex functions defined on convex sets, which extends [4, Theorem 3.3] and solves the question posed in [4, Remark 3.5].

###### Theorem 3.1.

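Let $\Omega$ be a nonempty open convex subset of a Hilbert space $\mathcal{H}$, let $f\colon\Omega\to\mathbb{R}$ be a convex function and $\beta>0$. Then the following are equivalent.

1. $f$ is Gâteaux differentiable on $\Omega$ and $\nabla f$ is $\beta$-Lipschitz continuous on $\Omega$.

2. the map $x\mapsto\frac{\beta}{2}\|x\|^{2}-f(x)$ is convex on $\Omega$.

3. $f$ is Gâteaux differentiable on $\Omega$ and $\nabla f$ is $1/\beta$-cocoercive.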

Moreover, if any of the above conditions holds, then $f\in C^{1,1}(\Omega)$.
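The three equivalent conditions (a $\beta$-Lipschitz gradient, convexity of $x\mapsto\frac{\beta}{2}\|x\|^{2}-f(x)$, and $1/\beta$-cocoercivity of the gradient) can be probed numerically on a simple example. The sketch below uses the illustrative function $f(x)=x-\log(1+x)$ on the open convex set $\Omega=(0,+\infty)\subset\mathbb{R}$, for which $f''(x)=(1+x)^{-2}\in(0,1)$, so $\beta=1$ is assumed. The function, the sampling interval and all names are assumptions made for the illustration.

```python
import math
import random

BETA = 1.0  # assumed Lipschitz constant of f' on Omega = (0, +inf)

def f(x):
    """Illustrative convex function on Omega = (0, +inf)."""
    return x - math.log1p(x)

def df(x):
    """Derivative of f; its own derivative 1 / (1 + x)^2 lies in (0, 1) on Omega."""
    return x / (1.0 + x)

def h(x):
    """The map x -> (BETA / 2) * x^2 - f(x) from the convexity condition."""
    return 0.5 * BETA * x * x - f(x)

def smallest_slacks(trials=20_000, seed=1):
    """Smallest observed slack in each of the three conditions over random pairs."""
    rng = random.Random(seed)
    slack_a = slack_b = slack_c = float("inf")
    for _ in range(trials):
        x = rng.uniform(1e-3, 50.0)
        y = rng.uniform(1e-3, 50.0)
        dg = df(x) - df(y)
        # Lipschitz condition: |f'(x) - f'(y)| <= BETA * |x - y|
        slack_a = min(slack_a, BETA * abs(x - y) - abs(dg))
        # Midpoint convexity of h: h((x + y) / 2) <= (h(x) + h(y)) / 2
        slack_b = min(slack_b, 0.5 * (h(x) + h(y)) - h(0.5 * (x + y)))
        # Cocoercivity: BETA * (f'(x) - f'(y)) * (x - y) >= (f'(x) - f'(y))^2
        slack_c = min(slack_c, BETA * dg * (x - y) - dg * dg)
    return slack_a, slack_b, slack_c
```

All three slacks should come out nonnegative up to rounding. Sampling can only fail to find a violation; it does not prove the inequalities.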

To prove Theorem 3.1, we first show the result in finite dimensions under the additional assumption that $f\in C^{1,1}(\Omega)$ (see the next lemma). Then, we obtain Theorem 3.1 in finite dimensional spaces (see Lemma 3.3). Finally, the proof of Theorem 3.1 follows from finite dimensional reductions and Lemma 3.3.

###### Lemma 3.2.

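Let $\Omega$ be a nonempty open convex subset of $\mathbb{R}^{n}$, let $f\in C^{1,1}(\Omega)$ be a convex function and $\beta>0$. Then the following are equivalent.

1. $\nabla f$ is $\beta$-Lipschitz continuous on $\Omega$.

2. the map $x\mapsto\frac{\beta}{2}\|x\|^{2}-f(x)$ is convex on $\Omega$.

3. $\nabla f$ is $1/\beta$-cocoercive.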

###### Proof.

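Let us consider the functions $g:=\frac{1}{2}\|\cdot\|^{2}-\frac{1}{\beta}f$ and $h:=\frac{2}{\beta}f-\frac{1}{2}\|\cdot\|^{2}$. It is clear that

$$A\in\overline{\nabla}^{2}(f/\beta)(x)\iff B:=I-A\in\overline{\nabla}^{2}g(x). \tag{8}$$

On the one hand,

$$\begin{aligned}
\nabla f \text{ is } \beta\text{-Lipschitz continuous}
&\iff(\forall x\in\Omega)(\forall A\in\overline{\nabla}^{2}(f/\beta)(x))\ \ \|A\|\leq 1 && \text{(by Proposition 2.2 (ii))}\\
&\iff(\forall x\in\Omega)(\forall A\in\overline{\nabla}^{2}(f/\beta)(x))(\forall u\in\mathbb{R}^{n})\ \ 0\leq\langle u,Au\rangle\leq\|u\|^{2} && \text{(by Proposition 2.2 (i))}\\
&\iff(\forall x\in\Omega)(\forall A\in\overline{\nabla}^{2}(f/\beta)(x))(\forall u\in\mathbb{R}^{n})\ \ 0\leq\|u\|^{2}-\langle u,Au\rangle\\
&\iff(\forall x\in\Omega)(\forall B\in\overline{\nabla}^{2}g(x))(\forall u\in\mathbb{R}^{n})\ \ 0\leq\langle u,Bu\rangle && \text{(by (8))}\\
&\iff g \text{ is convex} && \text{(by Proposition 2.2 (i))},
\end{aligned}$$

which shows that (a) is equivalent to (b).

On the other hand,

$$\begin{aligned}
g \text{ is convex}
&\iff(\forall x\in\Omega)(\forall B\in\overline{\nabla}^{2}g(x))(\forall u\in\mathbb{R}^{n})\ \ 0\leq\langle u,Bu\rangle && \text{(by Proposition 2.2 (i))}\\
&\iff(\forall x\in\Omega)(\forall A\in\overline{\nabla}^{2}(f/\beta)(x))(\forall u\in\mathbb{R}^{n})\ \ 0\leq\|u\|^{2}-\langle u,Au\rangle\\
&\iff(\forall x\in\Omega)(\forall A\in\overline{\nabla}^{2}(f/\beta)(x))(\forall u\in\mathbb{R}^{n})\ \ -\|u\|^{2}\leq 2\langle u,Au\rangle-\|u\|^{2}\leq\|u\|^{2}\\
&\iff(\forall x\in\Omega)(\forall B\in\overline{\nabla}^{2}h(x))\ \ \|B\|\leq 1\\
&\iff \text{the map } x\mapsto\nabla h(x)=\tfrac{2}{\beta}\nabla f(x)-x \text{ is } 1\text{-Lipschitz} && \text{(by Proposition 2.2 (ii))}\\
&\iff\nabla f \text{ is } 1/\beta\text{-cocoercive} && \text{(by Example 1 (ii))},
\end{aligned}$$

which proves that (b) is equivalent to (c). ∎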

Now we proceed to remove the hypothesis $f\in C^{1,1}(\Omega)$ from Lemma 3.2.

###### Lemma 3.3.

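Let $\Omega$ be a nonempty open convex subset of $\mathbb{R}^{n}$, let $f\colon\Omega\to\mathbb{R}$ be a convex function and $\beta>0$. Then the following are equivalent.

1. $f$ is Gâteaux differentiable on $\Omega$ and $\nabla f$ is $\beta$-Lipschitz continuous on $\Omega$.

2. the map $x\mapsto\frac{\beta}{2}\|x\|^{2}-f(x)$ is convex on $\Omega$.

3. $f$ is Gâteaux differentiable on $\Omega$ and $\nabla f$ is $1/\beta$-cocoercive.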

###### Proof.

According to [7, Theorem 2.2.1], for convex functions defined on subsets of $\mathbb{R}^{n}$, Gâteaux differentiability is equivalent to Fréchet differentiability. We proceed to show that any of the above conditions implies that $f\in C^{1,1}(\Omega)$. Indeed, it is clear that (a) and (c) imply that $f\in C^{1,1}(\Omega)$. To prove that (b) implies that $f\in C^{1,1}(\Omega)$, we follow some ideas from . Let us define $h(x):=\frac{\beta}{2}\|x\|^{2}-f(x)$. Thus,

$$\frac{\beta}{2}\|x\|^{2}=f(x)+h(x)\quad\text{for all } x\in\Omega,$$

which implies that $\beta x=\partial f(x)+\partial h(x)$ for all $x\in\Omega$. Therefore, $\partial f(x)$ and $\partial h(x)$ are nonempty and contain a single element. Hence, by virtue of [7, Corollary 4.2.5], the function $f$ is Gâteaux differentiable on $\Omega$ and, thus, Fréchet differentiable on $\Omega$ and continuously differentiable on $\Omega$ (see [7, Theorem 2.2.2]). It is not difficult to prove that (b) implies the following inequality:

$$\frac{1}{2}\|x-y\|^{2}\geq D_{f}(x,y):=f(x)-f(y)-\langle\nabla f(y),x-y\rangle\geq 0\quad\text{for all } x,y\in\Omega.$$

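Fix $y\in\Omega$ and define $d(x):=D_{f}(x,y)$. Then $d\geq 0$ and $\nabla d(x)=\nabla f(x)-\nabla f(y)$ for all $x\in\Omega$ and $D_{d}=D_{f}$. Thus, we obtain

$$\frac{1}{2}\|z-x\|^{2}\geq D_{d}(z,x)=d(z)-d(x)-\langle\nabla d(x),z-x\rangle\quad\text{for all } z,x\in\Omega. \tag{9}$$

Fix $\bar{x}\in\Omega$ and $\delta>0$ such that $\bar{x}+\delta\mathbb{B}\subset\Omega$ and

$$\sup_{z\in\bar{x}+\delta\mathbb{B}}\|\nabla f(z)\|<+\infty.$$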

Let and such that . Therefore, by taking in (9) and using that , we obtain

$$D_{f}(x,y)\geq t\Big(1-\frac{t}{2}\Big)\|\nabla f(x)-\nabla f(y)\|^{2}.$$

Analogously, we get

$$D_{f}(y,x)\geq t\Big(1-\frac{t}{2}\Big)\|\nabla f(x)-\nabla f(y)\|^{2}.$$

Thus, for all

$$\langle\nabla f(x)-\nabla f(y),x-y\rangle=D_{f}(x,y)+D_{f}(y,x)\geq t(2-t)\|\nabla f(x)-\nabla f(y)\|^{2},$$

which shows that is Lipschitz on . Therefore,

Now we are ready to prove Theorem 3.1.
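
Proof of Theorem 3.1.

(a) ⇒ (b): Let $x,y\in\Omega$ and define $F:=\operatorname{span}\{x,y\}$. We observe that $F$ is a finite dimensional Hilbert space. Thus the restriction of $f$ to $\Omega\cap F$, $f|_{F}$, is Gâteaux differentiable in $\Omega\cap F$ and for all $a,b\in\Omega\cap F$ and $h\in F$

$$\langle\nabla f|_{F}(a)-\nabla f|_{F}(b),h\rangle=\langle\nabla f(a)-\nabla f(b),h\rangle.$$

Hence, for all $a,b\in\Omega\cap F$

$$\|\nabla f|_{F}(a)-\nabla f|_{F}(b)\|=\sup_{h\in\mathbb{B}\cap F}\langle\nabla f|_{F}(a)-\nabla f|_{F}(b),h\rangle\leq\|\nabla f(a)-\nabla f(b)\|\leq\beta\|a-b\|,$$

which shows that $\nabla f|_{F}$ is $\beta$-Lipschitz on $\Omega\cap F$. Therefore, according to Lemma 3.3, the map

$$x\mapsto h(x):=\frac{\beta}{2}\|x\|^{2}-f|_{F}(x)$$

is convex on $\Omega\cap F$. In particular, for all $\lambda\in[0,1]$

$$h(\lambda x+(1-\lambda)y)\leq\lambda h(x)+(1-\lambda)h(y).$$

Since $x,y\in\Omega$ are arbitrary, it follows that the map $x\mapsto\frac{\beta}{2}\|x\|^{2}-f(x)$ is convex on $\Omega$.
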
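(b) ⇒ (a): We first observe that $h:=\frac{\beta}{2}\|\cdot\|^{2}-f$ is convex (with finite values) and for all $x\in\Omega$

$$\frac{\beta}{2}\|x\|^{2}=f(x)+h(x).$$

Hence, by virtue of (7), for all $x\in\Omega$

$$\beta x=\partial f(x)+\partial h(x),$$

which implies that $\partial f(x)$ and $\partial h(x)$ are nonempty and contain a single element. Therefore, according to [7, Corollary 4.2.5], the functions $f$ and $h$ are Gâteaux differentiable on $\Omega$. Thus, if $F\subset\mathcal{H}$ is finite dimensional, then $h|_{F}$ is convex on $\Omega\cap F$. Hence, by virtue of Lemma 3.3, $\nabla f|_{F}$ is $\beta$-Lipschitz on $\Omega\cap F$, i.e., for all $x,y\in\Omega\cap F$

$$\sup_{h\in\mathbb{B}\cap F}\langle\nabla f(x)-\nabla f(y),h\rangle=\|\nabla f|_{F}(x)-\nabla f|_{F}(y)\|_{\mathcal{L}(F,F)}\leq\beta\|x-y\|. \tag{10}$$

Let us consider

$$\mathcal{F}_{x,y}:=\{F\subset\mathcal{H}: F \text{ is a linear subspace of } \mathcal{H} \text{ with } x,y\in F \text{ and } \dim F<+\infty\}.$$

Hence, since (10) holds for any finite dimensional $F$, we obtain

$$\sup_{F\in\mathcal{F}_{x,y}}\|\nabla f|_{F}(x)-\nabla f|_{F}(y)\|_{\mathcal{L}(F,F)}=\sup_{F\in\mathcal{F}_{x,y}}\sup_{h\in\mathbb{B}\cap F}\langle\nabla f(x)-\nabla f(y),h\rangle=\|\nabla f(x)-\nabla f(y)\|_{\mathcal{L}(\mathcal{H},\mathcal{H})}. \tag{11}$$

Therefore, by taking supremum in (10), we conclude that for all $x,y\in\Omega$

$$\|\nabla f(x)-\nabla f(y)\|\leq\beta\|x-y\|,$$

which proves (a).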

: It is straightforward.

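(c) ⇒ (a): Let $F\subset\mathcal{H}$ with $\dim F<+\infty$. Then $F$ is a Hilbert space and for all $x,y\in\Omega\cap F$

$$\beta\langle\nabla f|_{F}(x)-\nabla f|_{F}(y),x-y\rangle=\beta\langle\nabla f(x)-\nabla f(y),x-y\rangle\geq\|\nabla f(x)-\nabla f(y)\|^{2}_{\mathcal{L}(F,F)}.$$

Hence, as a result of Lemma 3.3, for all $x,y\in\Omega\cap F$

$$\|\nabla f|_{F}(x)-\nabla f|_{F}(y)\|_{\mathcal{L}(F,F)}\leq\beta\|x-y\|_{F}=\beta\|x-y\|_{\mathcal{H}}.$$

Since $F$ is arbitrary, by using (11), we conclude that for all $x,y\in\Omega$

$$\|\nabla f(x)-\nabla f(y)\|_{\mathcal{L}(\mathcal{H},\mathcal{H})}\leq\beta\|x-y\|,$$

which shows the equivalence between (a), (b) and (c). Finally, the fact that any of the above conditions implies that $f\in C^{1,1}(\Omega)$ follows from Smulian’s theorem (see, e.g., [7, Theorem 4.2.10]). ∎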

## 4. Application to Convex Optimization

In this section, we present an application of Theorem 3.1 to convex optimization. Let be a Hilbert space, and be a convex function defined over an open convex set with

$$\overline{\operatorname{dom}\varphi}\subset\Omega.$$

We study the Tikhonov regularization for the projected dynamical system (5) (see [10, 14] for more details on Tikhonov regularization). Let us consider the following assumptions:
Assumption Let be a positive function satisfying

1. is absolutely continuous, nonincreasing and ;

2. ;

3. .

We observe that, for example, the function with satisfies the previous assumption.

###### Theorem 4.1.

Assume, in addition to Assumption , that is -Lipschitz on and . Let and be a function satisfying Assumption . Let be the unique solution of

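$$\begin{cases}
-\dot{x}(t)=x(t)-\operatorname{prox}_{\mu\varphi}\big(x(t)-\mu\nabla\psi(x(t))\big)+\varepsilon(t)\big(x(t)-y\big),\\
x(0)=x_{0},
\end{cases}$$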

where . Then converges strongly to , as .

###### Proof.

According to Theorem 3.1, the operator is -cocoercive. Thus, we can apply [14, Theorem 5.2] to obtain the desired result. ∎

## References

[1] Abbas, B., Attouch, H.: Dynamical systems and forward-backward algorithms associated with the sum of a convex subdifferential and a monotone cocoercive operator. Optimization 64(10), 2223–2252 (2015)
[2] Attouch, H., Briceño Arias, L., Combettes, P.L.: A parallel splitting method for coupled monotone inclusions. SIAM J. Control Optim. 48(5), 3246–3270 (2009/10)
[3] Baillon, J.B., Haddad, G.: Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Israel Journal of Mathematics 26(2), 137–150 (1977)
[4] Bauschke, H.H., Combettes, P.L.: The Baillon-Haddad theorem revisited. J. Convex Anal. 17(3-4), 781–787 (2010)
[5] Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces, second edn. CMS Books Math./Ouvrages Math. SMC. Springer, Cham (2017)
[6] Boţ, R.I., Csetnek, E.R.: A dynamical system associated with the fixed points set of a nonexpansive operator. J. Dynam. Differential Equations 29(1), 155–168 (2017)
[7] Borwein, J., Vanderwerff, J.: Convex functions: constructions, characterizations and counterexamples, Encyclopedia Math. Appl., vol. 109. Cambridge University Press, Cambridge (2010)
[8] Byrne, C.: On a generalized Baillon-Haddad theorem for convex functions on Hilbert space. J. Convex Anal. 22(4), 963–967 (2015)
[9] Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5-6), 475–504 (2004)
[10] Cominetti, R., Peypouquet, J., Sorin, S.: Strong asymptotic convergence of evolution equations governed by maximal monotone operators with Tikhonov regularization. J. Differential Equations 245(12), 3753–3763 (2008)
[11] Contreras, A., Peypouquet, J.: Asymptotic equivalence of evolution equations governed by cocoercive operators and their forward discretizations. J. Optim. Theory Appl. (2018)
[12] Hiriart-Urruty, J.B., Strodiot, J.J., Nguyen, V.H.: Generalized Hessian matrix and second-order optimality conditions for problems with $C^{1,1}$ data. Appl. Math. Optim. 11(1), 43–56 (1984)
[13] Mordukhovich, B.: Variational analysis and generalized differentiation. I, Grundlehren Math. Wiss., vol. 330. Springer-Verlag, Berlin (2006). Corrected, 2nd printing 2013
[14] Pérez-Aros, P., Vilches, E.: Tikhonov regularization of dynamical systems associated with nonexpansive operators defined in closed and convex sets. Submitted (2019)
[15] Rockafellar, R.T., Wets, R.: Variational analysis, Grundlehren Math. Wiss., vol. 317. Springer-Verlag, Berlin (1998). Corrected 3rd printing 2009
[16] Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29(1), 119–138 (1991)
[17] Zhu, D., Marcotte, P.: Co-coercivity and its role in the convergence of iterative schemes for solving variational inequalities. SIAM J. Optim. 6(3), 714–726 (1996)